Preface





This book serves two purposes: the first is as a text, and the second is as a
resource for someone wishing to explore topics not found in other automata theory texts.
It was originally written as a textbook for anyone seeking to learn the basic
theories of automata, languages, and Turing machines. In the first five chapters, 
the book presents the necessary basic material for the study of these theories. 
Examples of topics included are: regular languages and Kleene’s Theorem; 
minimal automata and syntactic monoids; the relationship between context-free 
languages and pushdown automata; and Turing machines and decidability. The 
exposition is gentle but rigorous, with many examples and exercises (teachers 
using the book with their course may obtain a copy of the solution manual by 
sending an email to solutions@cambridge.org). It includes topics not found in
other texts such as codes, retracts, and semiretracts. 

Thanks primarily to Tom Head, the book has been expanded so that it should 
be of interest to people in mathematics, computer science, biology, and possibly 
other areas. Thus, the second purpose of the book is to provide material for 
someone already familiar with the basic topics mentioned above, but seeking 
to explore topics not found in other automata theory books. 

The two final chapters introduce two programs of research not previously 
included in beginning expositions. Chapter 6 introduces a visually inspired 
approach to languages allowed by the unique representation of each word as a 
power of a primitive word. The required elements of the theory of combinatorics 
on words are included in the exposition of this chapter. This is an entirely fresh 
area of research problems that are accessible on the completion of Chapter 6. 
Chapter 7 introduces recently developed language theory that has been inspired 
by developments in the biomolecular sciences and DNA computing. Both of 
these final chapters are kept within automata theory through their concentration 
on results in regular languages. Research in progress has begun to extend these 
concepts to broader classes of languages. There are now specialized books on 
DNA computing, and in fact a rapidly growing Springer-Verlag series on
‘Natural Computing’ is in progress. This book is the first one to link
introductory automata theory to this thriving new area.

Readers with a strong background will probably already be familiar with 
the material in Chapter 1. Those seeking to learn the basic theory of automata, 
languages, and Turing machines will probably want to read the chapters in order. 
The sections on retracts and semiretracts, while providing interesting examples 
of regular languages, are not necessary for reading the remainder of the book. 

A person already familiar with the basics of automata, languages, and Turing
machines will probably go directly to Chapters 6 and 7 and possibly the sections
on retracts and semiretracts.

I thank Tom Head for the work he has done on this book including his 
contributions of Chapters 6 and 7 as well as other topics. I also thank Brett 
Bernstein for his excellent proofreading of an early version of the book and 
Kristin and Phil Muzik for creating the figures for the book. Finally I would 
like to thank Ken Blake and David Tranah at Cambridge University Press for 
their help and support. 
Introduction


1.1 Sets 


Sets form the foundation for mathematics. We shall define a set to be a
well-defined collection of objects. This definition is similar to the one given by
Georg Cantor, one of the pioneers in the early development of set theory. The
inadequacy of this definition became apparent when paradoxes or contradictions
were discovered by the Italian logician Burali-Forti in 1897 and later by Bertrand
Russell with the famous Russell paradox. It became obvious that sets had to
be defined more carefully. Axiomatic systems have been developed for set
theory to correct the problems discussed above and, it is hoped, to avoid further
contradictions and paradoxes. These systems include the
Zermelo-Fraenkel-von Neumann system, the Gödel-Hilbert-Bernays system, and the
Russell-Whitehead system. In these systems the items that were allowed to be sets were
restricted. Axioms were created to define sets. Any object which could not be
created from these axioms was not allowed to be a set. These systems have
been shown to be equivalent in the sense that if one system is consistent, then
they all are. However, Gödel has shown that if the systems are consistent, it is
impossible to prove that they are.


Definition 1.1 An object in a set is called an element of the set or is said to
belong to the set. If an object x is an element of a set A, this is denoted by
x ∈ A. If an object x is not a member of a set A, this is denoted by x ∉ A.


Objects in a set are called elements. Finite sets may be described by listing
their elements. For example, the set of positive integers less than or equal to
seven may be described by the notation {1, 2, 3, 4, 5, 6, 7}, where the braces
indicate that we are describing a set. Thus symbols in an alphabet can
be listed using this notation. We can also list the set of positive integers less than
or equal to 10 000 by using the notation {1, 2, 3, 4, ..., 10 000} and the set of
positive integers by {1, 2, 3, 4, ...}, where three dots denote the continuation of
a pattern. By definition, 1 ∈ {1, 2, 3, 4, 5} but 8 ∉ {1, 2, 3, 4, 5}. An element of
a set may also be a set. Therefore A = {1, 2, {3, 4, 5}, 3, 4} is a set that contains
elements 1, 2, {3, 4, 5}, 3, and 4. Note that 5 ∉ A, but {3, 4, 5} ∈ A.

In many cases, listing the elements of a set can be tedious if not impossible. 
For example, consider listing the set of all primes. We thus have a second form 
of notation called set builder notation. Using this notation, the set of all objects 
having property P will be described by {x : x has property P}. For example,
the set of all former Prime Ministers of Britain would be described by {x : x
has been a Prime Minister of Britain}. The set of all positive even integers less
than or equal to 100 could be described by {x : x is a positive even integer less
than or equal to 100}.
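
Set builder notation has a close analogue in programming languages that support set comprehensions. The following is a minimal sketch in Python, assuming a finite range of candidates (here 1 to 100) so that the set can actually be enumerated; the name `evens` is purely illustrative.

```python
# {x : x is a positive even integer less than or equal to 100}
# The candidate range 1..100 is an assumption that makes the set enumerable.
evens = {x for x in range(1, 101) if x % 2 == 0}

# Membership tests play the role of "x is an element of A".
print(100 in evens)  # True
print(7 in evens)    # False
print(len(evens))    # 50
```

An infinite set such as the set of all primes cannot be stored this way, which is exactly why the property-based description is needed.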


Definition 1.2 A set A is called a subset of a set B if every element of the set
A is an element of the set B. If A is a subset of B, this is denoted by A ⊆ B. If
A is not a subset of B, this is denoted by A ⊈ B.


Therefore {a, b, c} ⊆ {a, b, c, d, e} but {a, b, f} ⊈ {a, b, c, d, e}. By
definition, any set is a subset of itself.


Definition 1.3 A set A is equal to a set B if A ⊆ B and B ⊆ A.


Therefore two sets are equal if they contain the same elements. Notice that 
there is no order in a set. A set is simply defined by the elements that it contains. 
Also an element either belongs to a set or does not. It would be redundant to 
list an element more than once when defining a set. 


Definition 1.4 The intersection of two sets A and B, denoted by A ∩ B, is the
set consisting of all elements contained in both A and B.


Let A = {x : x plays tennis} and B = {x : x plays golf}, then A ∩ B = {x :
x plays tennis and golf}. If A = {x : x is a positive integer divisible by 3} and
B = {x : x is a positive integer divisible by 2}, then A ∩ B = {x : x is a positive
integer divisible by 6}.


Definition 1.5 The union of two sets A and B, denoted by A ∪ B, is the set
consisting of all elements contained in either A or B.


Let A = {x : x plays tennis} and B = {x : x plays golf}, then A ∪ B = {x :
x plays tennis or golf}.

If A = {x : x is a positive integer divisible by 3} and B = {x : x is a positive
integer divisible by 2}, then A ∪ B = {x : x is a positive integer divisible by
either 2 or 3}.

Definition 1.6 The set difference, denoted by B − A, is the set of all elements
in the set B that are not in the set A.


For example, the set {1, 2, 3, 4, 5} − {2, 4, 6, 8, 10} = {1, 3, 5}.


Example 1.1 Let A = {x : x plays tennis} and B = {x : x plays golf}, the set
A − B = {x : x plays tennis but does not play golf}.


Definition 1.7 The symmetric difference, denoted by A △ B, is the set
(A − B) ∪ (B − A).


It is easily seen that A △ B = (A ∪ B) − (A ∩ B).


Example 1.2 Let A = {x : x plays tennis} and B = {x : x plays golf}, the set
A △ B = {x : x plays tennis or golf but not both}.
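
The four operations defined so far correspond directly to operators on Python's built-in `set` type. The sketch below reuses the two finite sets from the set difference example; the variable names are illustrative only.

```python
A = {1, 2, 3, 4, 5}
B = {2, 4, 6, 8, 10}

intersection = A & B   # elements in both A and B
union = A | B          # elements in A or in B
difference = A - B     # elements of A that are not in B
sym_diff = A ^ B       # elements in exactly one of A and B

print(difference)      # {1, 3, 5}

# The two descriptions of the symmetric difference agree.
assert sym_diff == (A - B) | (B - A)
assert sym_diff == (A | B) - (A & B)
```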


We define two special sets. The first is the empty set, which is denoted 
by Ø or {}. As the name implies, this set contains no elements. It is a subset 
of every set A since every element in the empty set is also in A. The second 
special set is the universe or universe of discourse, which we denote by U. 
The universe is given, and limits or describes the type of sets under discussion, 
since they must all be subsets of the universe. For example if the sets we are 
describing are subsets of the integers then the universe could be the set of 
integers. If the universe is the set of college students, then the set {x : x
is a musician} would be the set of all musicians who are in college. Often the 
universe is understood and so is not explicitly mentioned. Later we shall see 
that the universe of particular interest to us is the set of all strings of symbols 
in a given alphabet. 


Definition 1.8 Let A be a set. The complement of A, denoted by A′ = U − A,
is the set of all elements of the universe that are not in A.


Example 1.3 Let A be the set of even integers and U be the set of integers.
Then A′ is the set of odd integers.


Example 1.4 Let A = {x : x collects coins}, then A′ = {x : x does not collect
coins}.


The proof of the following theorem is left to the reader.

Theorem 1.1 Let A, B, and C be subsets of the universal set U.

(a) Distributive properties

A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

(b) Idempotent properties

A ∩ A = A,
A ∪ A = A.

(c) Double Complement property

(A′)′ = A.

(d) De Morgan’s laws

(A ∪ B)′ = A′ ∩ B′,
(A ∩ B)′ = A′ ∪ B′.

(e) Commutative properties

A ∩ B = B ∩ A,
A ∪ B = B ∪ A.

(f) Associative properties

A ∩ (B ∩ C) = (A ∩ B) ∩ C,
A ∪ (B ∪ C) = (A ∪ B) ∪ C.

(g) Identity properties

A ∪ Ø = A,
A ∩ U = A.

(h) Complement properties

A ∪ A′ = U,
A ∩ A′ = Ø.
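
Before attempting the proof, it can be reassuring to test several of these identities exhaustively on a small universe. The following sketch checks the distributive, double complement, De Morgan, and complement properties for every choice of subsets of a four-element universe. The universe `U` and the helper names are assumptions made for illustration, and a finite check of this kind is evidence, not a proof.

```python
from itertools import combinations

U = frozenset(range(4))  # a small universe, chosen only for the check


def subsets(s):
    """Return all subsets of s as frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def comp(a):
    """Complement relative to the universe U."""
    return U - a


def check_theorem_1_1():
    subs = subsets(U)
    for a in subs:
        if comp(comp(a)) != a:                         # double complement
            return False
        if a | comp(a) != U or a & comp(a) != frozenset():
            return False                               # complement properties
        for b in subs:
            if comp(a | b) != comp(a) & comp(b):       # De Morgan
                return False
            if comp(a & b) != comp(a) | comp(b):
                return False
            for c in subs:
                if a & (b | c) != (a & b) | (a & c):   # distributive
                    return False
                if a | (b & c) != (a | b) & (a | c):
                    return False
    return True


print(check_theorem_1_1())  # True
```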


Definition 1.9 The size or cardinality of a finite set A, denoted by |A|, is the
number of elements in the set. An infinite set which can be listed so that there is
a first element, second element, third element, etc. is called countably infinite.
If it cannot be listed, it is said to be uncountable. Two infinite sets have the
same cardinality if there is a one-to-one correspondence between the two sets.
We denote this by |A| = |B|. If there is a one-to-one correspondence between
A and a subset of B, we denote this by |A| ≤ |B|. If |A| ≤ |B| but there is no
one-to-one correspondence between A and B, then we denote this by |A| < |B|.


Thus the cardinality of the set {a, b, c, {d, e, f}} is 4. Intuitively, there is a
one-to-one correspondence between two sets if the elements of the two sets can be
written in pairs so that each element in one set is paired with one and only
one element of the other set. The positive integers are obviously countable.
Although it will not be proved here, the integers and the rational numbers are
both countable sets. The real numbers, however, are not a countable set. We
see that there are at least two kinds of infinite sets, the countable sets and the
uncountable sets, with different cardinality; however, we shall soon see that there
are infinitely many infinite sets of different cardinality.

Further discussion of cardinality will be continued in the appendices. 


Definition 1.10 Let A and B be sets. The Cartesian product of A and B,
denoted by A × B, is the set {(a, b) : a ∈ A and b ∈ B}.


For example, let A = {a, b} and B = {1, 2, 3}, then
A × B = {(a, 1), (a, 2), (a, 3), (b, 1), (b, 2), (b, 3)}.


The familiar Cartesian plane R × R is the set of all ordered pairs of real numbers.
Note that for finite sets |A × B| = |A| × |B|.


Definition 1.11 The power set of a set A, denoted by P(A), is the set of all 
subsets of A. 


For example the power set of {a, b, c} is 
{{a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}, Ø}. 


In the finite case, it can be easily shown that |P(A)| = 2^|A|.
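
Both constructions can be enumerated for small finite sets, which makes the two counting formulas easy to observe. This is a minimal sketch using the standard `itertools` module; the function names `cartesian` and `power_set` are illustrative, and `frozenset` is used because Python sets cannot contain ordinary (mutable) sets as elements.

```python
from itertools import chain, combinations, product


def cartesian(a, b):
    """Return A x B as a set of ordered pairs."""
    return set(product(a, b))


def power_set(a):
    """Return the set of all subsets of a, each as a frozenset."""
    items = list(a)
    all_combos = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(c) for c in all_combos}


A = {"a", "b"}
B = {1, 2, 3}
AxB = cartesian(A, B)
P = power_set({"a", "b", "c"})

print(len(AxB) == len(A) * len(B))  # True: |A x B| = |A| x |B|
print(len(P) == 2 ** 3)             # True: |P(A)| = 2^|A|
```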


Exercises 


(1) State which of the following are true and which are false:
(a) {Ø} ⊆ A for an arbitrary set A.
(b) Ø ⊆ A for an arbitrary set A.
(c) {a, b, c} ⊆ {a, b, {a, b, c}}.
(d) {a, b, c} ∈ {a, b, {a, b, c}}.
(e) A ∈ P(A).
(2) Prove Theorem 1.1. Let A, B, and C be subsets of the universal set U.
(a) Idempotent properties
A ∩ A = A,
A ∪ A = A.
(b) Double Complement property
(A′)′ = A.
(c) De Morgan’s laws
(A ∪ B)′ = A′ ∩ B′,
(A ∩ B)′ = A′ ∪ B′.
(d) Commutative properties
A ∩ B = B ∩ A,
A ∪ B = B ∪ A.
(e) Associative properties
A ∩ (B ∩ C) = (A ∩ B) ∩ C,
A ∪ (B ∪ C) = (A ∪ B) ∪ C.
(f) Distributive properties
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
(g) Identity properties
A ∪ Ø = A,
A ∩ U = A.
(h) Complement properties
A ∪ A′ = U,
A ∩ A′ = Ø.


(3) Given a set A ∈ P(C), find a set B such that A △ B = Ø.

(4) If A ⊆ B, what is A △ B?

(5) Using the properties in Theorem 1.1 prove that A ∩ (B △ C) =
(A ∩ B) △ (A ∩ C).

(6) Use induction to prove that for any finite set A, |A| < |P(A)|.

(7) (Russell’s Paradox) Let S be the set of all sets. Then S ∈ S. Obviously
Ø ∉ Ø. Let W = {A : A ∉ A}. Discuss whether W ∈ W.

(8) Prove using the properties in Theorem 1.1
(a) A − (B ∪ C) = (A − B) ∩ (A − C),
(b) A − (B ∩ C) = (A − B) ∪ (A − C).

(9) Use the fact that A ∩ (A ∪ B) = A to prove that A ∪ (A ∩ B) = A.

(10) Prove that if two disjoint sets are countable, then their union is countable. 


1.2 Relations 


Definition 1.12 Given sets A and B, any subset R of A × B is a relation
between A and B. If (a, b) ∈ R, this is often denoted by aRb. If A = B, R is
said to be a relation on A.

Note that relations need not have any particular property nor even be
describable. Obviously we will be interested in those relations which are describable
and have particular properties which will be shown later.


Example 1.5 If A = {a, b,c, d, e} and B = {1, 2, 3, 4, 5}, then 
{(a, 3), (a, 2), (c, 2), (d, 4), (e, 4), (e, 5)} 
is a relation between A and B. 
Example 1.6 {(x, y) : x > y} and {(x, y) : x² + y² = 4} are relations on R.


Example 1.7 If A is the set of people, then aRb if a and b are cousins is a 
relation on A. 


Definition 1.13 The domain of a relation R between A and B is the set
{a : a ∈ A and there exists b ∈ B so that aRb}. The range of a relation R
between A and B is the set {b : b ∈ B and there exists a ∈ A so that aRb}.


Example 1.8 The domain and range of the relation {(x, y) : x² + y² = 4} are
{x : −2 ≤ x ≤ 2} and {y : −2 ≤ y ≤ 2}, respectively.


Example 1.9 Let R be the relation on the set of people given by aRb if a and b
are cousins. The domain and range of R are both the set of people who have cousins.


Definition 1.14 Let R be a relation between A and B. The inverse of the
relation R, denoted by R⁻¹, is a relation between B and A, defined by R⁻¹ =
{(b, a) : (a, b) ∈ R}.


Example 1.10 If A = {a, b, c, d, e} and B = {1, 2, 3, 4, 5}, and
R = {(a, 3), (a, 2), (b, 3), (b, 5), (c, 3), (d, 2), (d, 3), (e, 4), (e, 5)}
is a relation between A and B then
R⁻¹ = {(3, a), (2, a), (3, b), (5, b), (3, c), (2, d), (3, d), (4, e), (5, e)}
is a relation between B and A.

Example 1.11 If R = {(x, y) : y = 4x²}, then R⁻¹ = {(y, x) : y = 4x²}.


Definition 1.15 Let R be a relation between A and B, and let S be a relation
between B and C. The composition of R and S, denoted by S ∘ R, is a relation
between A and C defined by (a, c) ∈ S ∘ R if there exists b ∈ B such that
(a, b) ∈ R and (b, c) ∈ S.


Example 1.12 Let A = {a, b, c, d, e} and B = {1, 2, 3, 4, 5} and
R = {(a, 3), (a, 2), (c, 2), (d, 4), (e, 4), (e, 5)}
be a relation between A and B. Then
R⁻¹ = {(3, a), (2, a), (2, c), (4, d), (4, e), (5, e)}
is a relation between B and A,
R ∘ R⁻¹ = {(3, 3), (3, 2), (2, 2), (2, 3), (4, 4), (4, 5), (5, 4), (5, 5)}
is a relation on B, and
R⁻¹ ∘ R = {(a, a), (a, c), (c, a), (c, c), (d, d), (d, e), (e, d), (e, e)}
is a relation on A.
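
Inverses and compositions of finite relations are easy to compute mechanically, which is a useful check on hand calculations such as the one in Example 1.12. The sketch below represents a relation as a Python set of pairs; the function names are illustrative. Because e is related to both 4 and 5, the pairs (4, 5) and (5, 4) belong to the first composition, and because 4 is related to both d and e, the pairs (d, e) and (e, d) belong to the second.

```python
def inverse(r):
    """Return the inverse relation {(b, a) : (a, b) in r}."""
    return {(b, a) for (a, b) in r}


def compose(s, r):
    """Return S o R: (a, c) is included when (a, b) in R and (b, c) in S."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


R = {("a", 3), ("a", 2), ("c", 2), ("d", 4), ("e", 4), ("e", 5)}
R_inv = inverse(R)

on_B = compose(R, R_inv)   # R o R^-1, a relation on B
on_A = compose(R_inv, R)   # R^-1 o R, a relation on A

print(sorted(on_B))
print(sorted(on_A))
```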


Example 1.13 If R = {(x, y) : y = x + 5} and S = {(y, z) : z = y²} then
S ∘ R = {(x, z) : z = (x + 5)²}.


Theorem 1.2 Composition of relations is associative; that is, if A, B, C, and D
are sets and if R ⊆ A × B, S ⊆ B × C, and T ⊆ C × D, then T ∘ (S ∘ R) =
(T ∘ S) ∘ R.


Proof First show that T ∘ (S ∘ R) ⊆ (T ∘ S) ∘ R. Let (a, d) ∈ T ∘ (S ∘ R);
then there exists c ∈ C such that (a, c) ∈ S ∘ R and (c, d) ∈ T. Since (a, c) ∈
S ∘ R, there exists b ∈ B so that (a, b) ∈ R and (b, c) ∈ S. Since (b, c) ∈ S
and (c, d) ∈ T, (b, d) ∈ T ∘ S. Since (b, d) ∈ T ∘ S and (a, b) ∈ R, (a, d) ∈
(T ∘ S) ∘ R. Thus T ∘ (S ∘ R) ⊆ (T ∘ S) ∘ R. The second part of the proof,
showing that (T ∘ S) ∘ R ⊆ T ∘ (S ∘ R), is similar and is left to the reader.














When R is a relation on a set A, there are certain special properties that R
may have which we now consider.


Definition 1.16 A relation R on A is reflexive if aRa for all a ∈ A. A relation
R on A is symmetric if aRb ⇒ bRa for all a, b ∈ A. A relation R on A
is antisymmetric if aRb and bRa implies a = b. A relation is transitive if
whenever aRb and bRc, then aRc.


Example 1.14 Let A be the set of all people and aRb if a and b are siblings. 
The relation R is not reflexive since a person cannot be their own brother or
sister. It is symmetric however since if a and b are siblings, then b and a are 
siblings. It might appear that R is transitive. Such is not the case however since 
if a and b are siblings, and b and a are siblings, we must conclude that a and a 
are siblings, which we know is not true. 


Example 1.15 Let A be the set of all people and aRb if a and b have the 
same parents. The relation R is reflexive since everyone has the same parents 
as themselves. It is symmetric since if a and b have the same parents, b and 
a have the same parents. It is also transitive since if a and b have the same
parents and b and c have the same parents, then a and c have the same parents. 


Example 1.16 Let A = {a, b, c, d, e} and
R = {(a, a), (a, b), (b, c), (b, b), (a, c), (c, c), (d, d), (a, d), (c, e), (d, a), (b, a)}.


R is not reflexive since (e, e) ∉ R. It is not symmetric because (a, c) ∈ R, but
(c, a) ∉ R. It is not antisymmetric since (a, d), (d, a) ∈ R, but d ≠ a. It is not
transitive since (a, c), (c, e) ∈ R, but (a, e) ∉ R.
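
For a finite relation, each property in Definition 1.16 can be tested by direct enumeration. The following sketch encodes the four definitions as small predicates and applies them to the relation of Example 1.16; the function names are illustrative.

```python
def is_reflexive(r, a):
    """aRa for every a in the underlying set."""
    return all((x, x) in r for x in a)


def is_symmetric(r):
    """aRb implies bRa."""
    return all((y, x) in r for (x, y) in r)


def is_antisymmetric(r):
    """aRb and bRa together imply a = b."""
    return all(x == y for (x, y) in r if (y, x) in r)


def is_transitive(r):
    """aRb and bRc together imply aRc."""
    return all((x, w) in r for (x, y) in r for (z, w) in r if y == z)


A = set("abcde")
R = {("a", "a"), ("a", "b"), ("b", "c"), ("b", "b"), ("a", "c"), ("c", "c"),
     ("d", "d"), ("a", "d"), ("c", "e"), ("d", "a"), ("b", "a")}

print(is_reflexive(R, A), is_symmetric(R),
      is_antisymmetric(R), is_transitive(R))  # False False False False
```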


Example 1.17 Let R be the relation on Z defined by aRb if a − b is a multiple
of 5. Certainly a − a = 0 is a multiple of 5, so R is reflexive. If a − b is a
multiple of 5, then a − b = 5k for some integer k. Hence b − a = 5(−k) is a
multiple of 5, so R is symmetric. If a − b is a multiple of 5 and b − c is a
multiple of 5, then a − b = 5k and b − c = 5m for some integers k and m. Then

a − c = a − b + b − c
      = 5k + 5m
      = 5(k + m)

so that a − c is a multiple of 5. Hence R is transitive.


Definition 1.17 A relation R on A is an equivalence relation if it is reflexive,
symmetric, and transitive.


Example 1.18 Let Z be the set of integers and R₁ be the relation on Z defined
by R₁ = {(m, n) : m − n is divisible by 5}. R₁ is shown above to be an
equivalence relation on the integers.


Example 1.19 Let A be the set of all people. Define R₂ by aR₂b if a and b
are the same age. This is easily shown to be an equivalence relation.


An equivalence relation on a set A divides A into nonempty subsets that are
mutually exclusive or disjoint, meaning that no two of them have an element
in common. In the first example above, the sets

{..., −20, −15, −10, −5, 0, 5, 10, 15, 20, ...}
{..., −19, −14, −9, −4, 1, 6, 11, 16, 21, ...}
{..., −18, −13, −8, −3, 2, 7, 12, 17, 22, ...}
{..., −17, −12, −7, −2, 3, 8, 13, 18, 23, ...}
{..., −16, −11, −6, −1, 4, 9, 14, 19, 24, ...}

contain elements that are related to each other and no element in one set is
related to an element in another set. In the second example the sets S_n = {x : x
is n years old} for n = 0, 1, 2, ... also divide the set of people into sets whose
members are related to each other. Also no person can belong to two sets. (See the
definition of partition below.)
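
The way an equivalence relation splits a set into disjoint classes can be reproduced on a finite sample. The sketch below groups the integers from −10 to 10 by the relation "a − b is a multiple of 5". The finite window and the helper name `equivalence_classes` are assumptions for illustration, and the helper relies on the relation really being an equivalence relation, since it compares each element with only one representative of each class.

```python
def equivalence_classes(elements, related):
    """Group elements into classes of an equivalence relation.

    `related(x, y)` must be reflexive, symmetric, and transitive;
    otherwise comparing against a single representative is not justified.
    """
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})
    return classes


window = range(-10, 11)
classes = equivalence_classes(window, lambda a, b: (a - b) % 5 == 0)

for cls in classes:
    print(sorted(cls))
```

Every element of the window lands in exactly one class, so the five classes are pairwise disjoint and their union is the whole window.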


Notation 1.1 Let R be an equivalence relation on a set A and a ∈ A. Then
[a]_R = {x : xRa}. If the relation is understood, then [a]_R is simply denoted by
[a]. Let [A]_R = {[a]_R : a ∈ A}.


Definition 1.18 Let A and I be nonempty sets and ⟨A⟩ = {A_i : i ∈ I} be a
set of nonempty subsets of A. The set ⟨A⟩ is called a partition of A if both of
the following are satisfied:


(a) A_i ∩ A_j = Ø for all i ≠ j.
(b) A = ⋃_{i ∈ I} A_i; that is, a ∈ A if and only if a ∈ A_i for some i ∈ I.

Theorem 1.3 A nonempty set of subsets ⟨A⟩ of a set A is a partition of A if
and only if ⟨A⟩ = [A]_R for some equivalence relation R.


Proof Let ⟨A⟩ = {A_i : i ∈ I} be a partition of A. Define a relation R on A by
aRb if and only if a and b are in the same subset A_i for some i. Certainly for
all a in A, aRa and R is reflexive. If a and b are in the same subset A_i, then b
and a are in the subset A_i and R is symmetric. Since the sets A_i ∩ A_j = Ø for
i ≠ j, if a and b are in the same subset and b and c are in the same subset, then
a and c are in the same subset. Hence R is transitive and R is an equivalence
relation.

Conversely, assume that R is an equivalence relation. We need to show that
[A]_R = {[a] : a ∈ A} is a partition of A. Certainly, for all a, [a] is nonempty
since a ∈ [a]. Obviously, A is the union of the [a] such that a ∈ A. Assume
that [a] ∩ [b] is nonempty and let x ∈ [a] ∩ [b]. Then xRa and xRb, and by
symmetry, aRx. But since aRx and xRb, by transitivity, aRb. Therefore,
a ∈ [b]. If y ∈ [a], then yRa and since aRb, by transitivity, yRb. Therefore,
[a] ⊆ [b]. Similarly, [b] ⊆ [a] so that [a] = [b], and we have a partition
of A.














Definition 1.19 [A]_R is called the set of equivalence classes of A given by
the relation R.


If the symmetric property is changed to antisymmetric property, we have the 
following: 


Definition 1.20 A relation R on A is a partial ordering if it is reflexive, 
antisymmetric, and transitive. If R is a partial ordering on A, then (A, R) is 
called a partially ordered set or a poset. 

Example 1.20 Let A be a collection of subsets of a set S. Define the relation
≤ by U ≤ V if U ⊆ V. It is easily seen that (A, ⊆) is a partially ordered set.


Example 1.21 Let R be the set of real numbers. Define the relation ≤ by
r ≤ s if r is less than or equal to s using the usual ordering on R.


Definition 1.21 Let (A, ≤) be a partially ordered set. If a, b ∈ A and either
a ≤ b or b ≤ a, then a and b are said to be comparable. If for every a, b ∈ A,
a and b are comparable, then (A, ≤) is called a chain or a total ordering.


Definition 1.22 For a subset B of a poset A, an element a of A is an upper
bound of B if b ≤ a (or a ≥ b) for all b in B. The element a is called a least
upper bound (lub) of B if (i) a is an upper bound of B and (ii) if any other
element a′ of A is an upper bound of B, then a ≤ a′. The least upper bound for
the entire poset A (if it exists) is called the greatest element of A. For a subset
B of a poset A, an element a of A is a lower bound of B if a ≤ b (or b ≥ a)
for all b in B. The element a is called a greatest lower bound (glb) of B if (i)
a is a lower bound of B and (ii) if any other element a′ of A is a lower bound
of B, then a ≥ a′. The greatest lower bound for the entire poset A (if it exists)
is called the least element of A.


Example 1.22 Let C = {a, b, c} and X be the power set of C.
X = P(C) = {Ø, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}.


Define the relation ≤ on X by T ≤ V if T ⊆ V. By definition, {a, b} is the
least upper bound of {Ø, {a}, {b}} and also of {Ø, {a}, {b}, {a, b}}. The set
{a, b, c} is the least upper bound of X. The element Ø is the greatest lower
bound for all three sets.
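
Least upper bounds and greatest lower bounds in a finite poset can be found by filtering, exactly as Definition 1.22 prescribes: collect the upper (or lower) bounds, then pick the one that lies below (or above) all the others. The sketch below does this for the power set of {a, b, c} ordered by inclusion; all function names are illustrative. Here the least upper bound of {Ø, {a}, {b}} comes out as {a, b} and its greatest lower bound as Ø.

```python
from itertools import combinations


def power_set(c):
    """All subsets of c, as frozensets."""
    items = sorted(c)
    return [frozenset(s)
            for r in range(len(items) + 1)
            for s in combinations(items, r)]


def least_upper_bound(subset, poset, leq):
    uppers = [a for a in poset if all(leq(b, a) for b in subset)]
    least = [a for a in uppers if all(leq(a, u) for u in uppers)]
    return least[0] if least else None


def greatest_lower_bound(subset, poset, leq):
    lowers = [a for a in poset if all(leq(a, b) for b in subset)]
    greatest = [a for a in lowers if all(leq(low, a) for low in lowers)]
    return greatest[0] if greatest else None


def subset_leq(s, t):
    """The ordering T <= V if T is a subset of V."""
    return s <= t


X = power_set({"a", "b", "c"})
B1 = [frozenset(), frozenset({"a"}), frozenset({"b"})]

lub_B1 = least_upper_bound(B1, X, subset_leq)
glb_B1 = greatest_lower_bound(B1, X, subset_leq)
lub_X = least_upper_bound(X, X, subset_leq)
print(sorted(lub_B1), sorted(glb_B1), sorted(lub_X))
```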


Definition 1.23 A poset A for which every pair of elements of A have a least
upper bound in A is called an upper semilattice and is denoted by (A, ∨) or
(A, +).


If every two elements of a poset A have a greatest lower bound in A, then
the following binary operation can be defined on the set. If a and b belong to A,
let a ∧ b = glb{a, b}.


Definition 1.24 A poset A for which every pair of elements of A have a greatest
lower bound in A is called a lower semilattice and is denoted by (A, ∧) or
(A, ·).


Example 1.22 is an example of a poset which is both an upper semilattice 
and a lower semilattice. 
Exercises


(1) What is wrong with the following proof?

If a relation R on a set A is symmetric and transitive, then it is reflexive.
Proof Since R is symmetric, if aRb then bRa. Since R is transitive,
if aRb and bRa then aRa. Therefore aRa and R is reflexive.

(2) Give an example of a relation R on a set A that is reflexive and symmetric, 
but not transitive. 

(3) Give an example of a relation R on a set A that is reflexive and transitive, 
but not symmetric. 

(4) Let σ and τ be equivalence relations on a set A. Show that σ ⊆ τ if and only
if each equivalence class in the set of equivalence classes given by τ is a union of
equivalence classes given by σ.

(5) A relation R on A is a partial order if it is reflexive, antisymmetric, and
transitive. It is a total order or chain if for any two elements a, b ∈ A,
either aRb or bRa. Give an example of a partial order that is not a total
order.

(6) Prove that the intersection of two partial orders on a set A is a partial order. 

(7) Prove that the intersection of two equivalence relations on a set A is an 
equivalence relation. 

(8) Given a set A, what is the intersection of all equivalence relations on A? 

(9) Let A be the set of ordered pairs of positive integers. Define the relation R 
on A by (a, b)R(c, d) if ad = bc. Is R an equivalence relation? If so what 
are the equivalence classes? 


1.3 Functions 


Definition 1.25 A relation f between A and B (that is, f ⊆ A × B) is a function
from A to B, denoted by f : A → B, if for every a ∈ A there is one and only one
b ∈ B so that (a, b) ∈ f. If f : A → B is a function and (a, b) ∈ f, we say that
b = f(a). The set A is called the domain of the function f and B is called the
codomain. If E ⊆ A, then f(E) = {b : f(a) = b for some a in E} is called the
image of E. The image of A itself is called the range of f. If F ⊆ B, then
f⁻¹(F) = {a : f(a) ∈ F} is called the preimage of F. A function f : A → B is
also called a mapping and we speak of the domain A being mapped into B by the
mapping f. If (a, b) ∈ f so that b = f(a), then we say that the element a is
mapped to the element b.


The proof of the following theorem is left to the reader. 

Theorem 1.4 Let f : A → B.


(a) f(A₁ ∪ A₂) = f(A₁) ∪ f(A₂) for A₁, A₂ ⊆ A.
(b) f⁻¹(B₁ ∪ B₂) = f⁻¹(B₁) ∪ f⁻¹(B₂) for B₁, B₂ ⊆ B.
(c) f(A₁ ∩ A₂) ⊆ f(A₁) ∩ f(A₂) for A₁, A₂ ⊆ A.
(d) f⁻¹(B₁ ∩ B₂) = f⁻¹(B₁) ∩ f⁻¹(B₂) for B₁, B₂ ⊆ B.
(e) f⁻¹(B₁′) = (f⁻¹(B₁))′ for B₁ ⊆ B.


Definition 1.26 If f : A → B, and the image of f is B, then f is onto. It is
also called a surjection or an epimorphism. Thus for every element b ∈ B, there is
an element a ∈ A so that b = f(a).


Definition 1.27 If f : A → B, and f(a) = f(a′) ⇒ a = a′ for all a, a′ ∈ A,
then f is one-to-one. It is also called a monomorphism or injection.


Definition 1.28 If f : A → B is one-to-one and onto, then f is called a
one-to-one correspondence or bijection. If A = B and A is finite, then f is also
called a permutation.


Notation 1.2 If f is a permutation on the set {1, 2, 3, ..., n}, then it can be
represented in the form

(  1     2    ...   n   )
( f(1)  f(2)  ...  f(n) )

Thus if f(a) = b, f(b) = d, f(c) = a, and f(d) = c, we may denote this by

( a  b  c  d )
( b  d  a  c )

Let

f = ( 1  2  3  4 )    and    g = ( 1  2  3  4 )
    ( 2  4  1  3 )               ( 2  3  4  1 )

To find the composition f ∘ g note that since g(1) = 2 appears under 1 in the
permutation for g, and f(2) = 4 appears under 2 in f, we may find (f ∘ g)(1) by going
from 1 down to 2 in g and then going from 2 down to 4 in the permutation f, so
(f ∘ g)(1) = 4. Similarly, to find (f ∘ g)(2), go down from 2 to 3 in g and from
3 to 1 in f, so (f ∘ g)(2) = 1. Continuing, we have

f ∘ g = ( 1  2  3  4 )
        ( 4  1  3  2 )

Example 1.23 Let f : A → B, where A and B are the set of real numbers, be
defined by f(x) = x², then f is a function whose range is the set of nonnegative
real numbers. It is not onto since the range is not B. It is not one-to-one since

f(2) = f(−2) = 4.


Example 1.24 Let f : A → B, where A and B are the set of real numbers,
be defined by f(x) = x³, then f is a function whose range is B. Hence it is
onto. It is also one-to-one since a³ = (a′)³ ⇒ a = a′.
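
For functions on finite sets, the notions of one-to-one, onto, and composition can be checked by machine. The sketch below stores a function as a Python dictionary. The two permutations are chosen to agree with the values g(1) = 2, f(2) = 4, g(2) = 3, and f(3) = 1 quoted in the discussion of Notation 1.2; their remaining values, the restriction of the squaring function to the integers from −3 to 3, and all helper names are assumptions made for illustration.

```python
def compose(f, g):
    """Return f o g, the map x -> f(g(x)); both maps are dicts."""
    return {x: f[g[x]] for x in g}


def is_one_to_one(f):
    """No two arguments share a value."""
    return len(set(f.values())) == len(f)


def is_onto(f, codomain):
    """Every element of the codomain is a value of f."""
    return set(f.values()) == set(codomain)


f = {1: 2, 2: 4, 3: 1, 4: 3}
g = {1: 2, 2: 3, 3: 4, 4: 1}
fg = compose(f, g)
print(fg)  # {1: 4, 2: 1, 3: 3, 4: 2}

# A finite stand-in for f(x) = x^2: neither one-to-one nor onto {0, ..., 9}.
square = {x: x * x for x in range(-3, 4)}
print(is_one_to_one(square), is_onto(square, range(10)))  # False False
```

Note that `compose(f, g)` applies g first, matching the convention (f ∘ g)(x) = f(g(x)).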

Definition 1.29 Let I : A → A be defined by I(a) = a for all a ∈ A. The
function I is called the identity function.


Definition 1.30 Let g : A → B and f : B → C, then (f ∘ g)(x) = f(g(x)).

The proof of the following theorem is elementary and is left to the reader:

Theorem 1.5 Let f : A → A, then I ∘ f = f ∘ I = f.


Theorem 1.6 Let f : A → B; then there exists a function f⁻¹ : B → A so
that f ∘ f⁻¹ = f⁻¹ ∘ f = I if and only if f is a bijection. The function f⁻¹
is also a bijection.


Proof Assume there exists a function f⁻¹ : B → A so that f ∘ f⁻¹ = f⁻¹ ∘
f = I, and f(a) = f(a′). Then f⁻¹ ∘ f(a) = f⁻¹ ∘ f(a′), so I(a) = I(a′)
and a = a′. Therefore f is one-to-one. Let b ∈ B and a = f⁻¹(b). Then f(a) =
f(f⁻¹(b)) = b, so f is onto.

Assume f : A → B is a bijection. Define the relation f⁻¹ on B × A by
f⁻¹(b) = a if f(a) = b. Let b ∈ B and choose a so that f(a) = b. This is
possible since f is onto. Therefore f⁻¹(b) = a and f⁻¹ has domain B. If
f⁻¹(b) = a and f⁻¹(b) = a′, then f(a) = b and f(a′) = b. But since f is
one-to-one, a = a′. Therefore f⁻¹ is well defined and hence f⁻¹ is a function.
By definition f ∘ f⁻¹ = f⁻¹ ∘ f = I.

By symmetry, f⁻¹ is a bijection.














The proof of the following theorem is left to the reader: 
Theorem 1.7 Let g : A → B and f : B → C; then:


(a) If g and f are onto B and C, respectively, then f ∘ g is onto C.
(b) If g and f are both one-to-one, then f ∘ g is one-to-one.
(c) If g and f are both one-to-one and onto, then f ∘ g is one-to-one and onto.
(d) If g and f have inverses, then (f ∘ g)⁻¹ = g⁻¹ ∘ f⁻¹.


Theorem 1.8 Let f : A → B be a function. The relation R defined by aRa′
if f(a) = f(a′) is an equivalence relation.


Proof Let a, a′, a″ ∈ A. Certainly f(a) = f(a) so R is reflexive. If f(a) =
f(a′), then f(a′) = f(a), so R is symmetric. If f(a) = f(a′) and f(a′) =
f(a″), then f(a) = f(a″) so R is transitive. Therefore R is an equivalence
relation.














Definition 1.31 Let R be an equivalence relation on A, and φ_R : A → [A]_R
be a function defined by φ_R(a) = [a]_R. The function φ_R is called the canonical
function from A to [A]_R.

Theorem 1.9 Let f : A → B be a function and R be the relation aRa′ iff
f(a) = f(a′); then there exists a function g : [A]_R → B defined by g([a]_R) =
f(a). Hence g ∘ φ_R = f.

[Diagram: commutative triangle with f : A → B along the top, φ_R : A → [A]_R
down the left side, and g : [A]_R → B along the diagonal.]


Proof Assume g([a]) = b and g([a′]) = b′, where [a] = [a′]. Then f(a) = b
and f(a′) = b′. But then aRa′ and f(a) = f(a′). Therefore b = b′ and g is
a function.
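
The factorization in Theorem 1.9 can be carried out concretely for a function on a finite set: group the arguments by their value to obtain the canonical function, then define g on each class by evaluating f at any representative. The sketch below uses the squaring function on the integers from −3 to 3 as a stand-in; that choice, and the names `canonical_factorization`, `phi`, and `g`, are assumptions for illustration.

```python
def canonical_factorization(f, domain):
    """Return (phi, g) with g[phi[a]] == f(a) for every a in domain.

    phi maps each element to its equivalence class under
    aRa' iff f(a) == f(a'); g maps each class to the common value of f.
    """
    domain = list(domain)
    phi = {a: frozenset(x for x in domain if f(x) == f(a)) for a in domain}
    # g is well defined because f is constant on each class.
    g = {cls: f(next(iter(cls))) for cls in set(phi.values())}
    return phi, g


def square(x):
    return x * x


domain = range(-3, 4)
phi, g = canonical_factorization(square, domain)

# g o phi agrees with f on the whole domain.
print(all(g[phi[a]] == square(a) for a in domain))  # True
```

In this construction g is always one-to-one, since two classes with the same value of f would be the same class.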














Theorem 1.10 Let f : [A]_R → B be a function and S be an equivalence
relation such that S ⊆ R, that is, aSa′ ⇒ aRa′. Then there exist functions g :
[A]_S → B and i : [A]_S → [A]_R such that f ∘ i = g.


Proof Let i : [A]_S → [A]_R be defined by i([a]_S) = [a]_R and g : [A]_S → B
by g([a]_S) = f([a]_R). The function i is trivially well defined. The proof that g
is a function is similar to the proof of the previous theorem.














Exercises 


(1) Prove Theorem 1.4. Let $f : A \to B$.
(a) $f(A_1 \cup A_2) = f(A_1) \cup f(A_2)$ for $A_1, A_2 \subseteq A$.
(b) $f^{-1}(B_1 \cup B_2) = f^{-1}(B_1) \cup f^{-1}(B_2)$ for $B_1, B_2 \subseteq B$.
(c) $f(A_1 \cap A_2) \subseteq f(A_1) \cap f(A_2)$ for $A_1, A_2 \subseteq A$.
(d) $f^{-1}(B_1 \cap B_2) = f^{-1}(B_1) \cap f^{-1}(B_2)$ for $B_1, B_2 \subseteq B$.
(e) $f^{-1}(B_1') = (f^{-1}(B_1))'$ for $B_1 \subseteq B$.
(2) Prove Theorem 1.7. Let $g : A \to B$ and $f : B \to C$; then:
(a) If $g$ and $f$ are onto $B$ and $C$, respectively, then $f \circ g$ is onto $C$.
(b) If $g$ and $f$ are both one-to-one, then $f \circ g$ is one-to-one.
(c) If $g$ and $f$ are both one-to-one and onto, then $f \circ g$ is one-to-one and
onto.
(d) If $g$ and $f$ have inverses, then $(f \circ g)^{-1} = g^{-1} \circ f^{-1}$.

(3) Give an example of a function $f$ and sets $A_1, A_2 \subseteq A$ such that
$f(A_1 \cap A_2) \neq f(A_1) \cap f(A_2)$.

(4) Prove that if f o g is one-to-one then g is one-to-one. 

(5) Prove that if f o g is onto, then f is onto. 


1.4 Semigroups 


In the following, for a function $* : S \times S \to S$ we shall use the notation $a * a'$ for
$*((a, a'))$.


Definition 1.32 A semigroup is a nonempty set $S$ together with a function $*$
from $S \times S \to S$ such that

$$a * (a' * a'') = (a * a') * a''$$

for all $a, a', a'' \in S$. The function or operation $*$ with this property is called
associative. The semigroup is denoted by $(S, *)$ or simply $S$ if the operation is
understood. If $S$ contains an identity element $1$ such that $1 * a = a * 1 = a$ for all
$a \in S$, then $S$ is called a monoid. If $S$ contains an element $0$ such that
$0 * a = a * 0 = 0$ for all $a \in S$, then $S$ is called a semigroup with zero. A semigroup
is commutative if $a * a' = a' * a$ for all $a, a' \in S$.


Example 1.25 Examples of semigroups include 


(1) The set of integers [positive integers, real numbers, positive real numbers, 
rational numbers, positive rational numbers] together with either of the 
operations addition or multiplication is a semigroup. 

(2) The set of functions $\{f \mid f : A \to A\}$ for a given set $A$, together with the
operation $\circ$ where $(f \circ g)(x) = f(g(x))$ is a semigroup.

(3) The set of $n \times n$ matrices with either of the operations addition or
multiplication is a semigroup.


Example 1.26 The set of nonnegative integers together with the operation 
addition is a monoid. All of the above examples except for the positive real 
numbers, positive integers, and positive rational numbers with the operation 
addition form a monoid. 


Every semigroup $S$ can be changed to a monoid by simply adding an element
$1$ and defining $1 * a = a * 1 = a$ for all $a \in S$. If $S$ was already a monoid, it
remains a monoid but with a different identity element. Normally one adds an
identity element to a semigroup only if it is not already a monoid. Similarly
every semigroup $S$ can be changed to a semigroup with zero by simply adding
an element $0$ and defining $0 * a = a * 0 = 0$ for all $a \in S$. Note that if, for a
positive integer $m$, we let $S_m$ be the set of all integers greater than or equal to
$m$, then $(S_m, +)$ is a semigroup. If we include $0$, then we have a monoid. Also,
since $(S_m, \cdot)$ is a semigroup, if we include $1$, then we have a monoid.


Notation 1.3 Let $S$ be a semigroup. The monoid $S^1 = S$ if $S$ is already a
monoid, and $S^1 = S \cup \{1\}$ otherwise. The semigroup $S^0 = S$ if $S$ is already a
semigroup with zero and $S^0 = S \cup \{0\}$ otherwise.


Definition 1.33 Let $(S, *)$ be a semigroup and $H$ be a nonempty subset of $S$. If
for all $h, h' \in H$, $h * h' \in H$, then $(H, *)$ is a subsemigroup of $(S, *)$. If $(S, *)$
is a monoid and $(H, *)$ is a subsemigroup of $(S, *)$ containing the identity of
the monoid, then $(H, *)$ is a submonoid of $(S, *)$.


Therefore the set of positive integers with the operation multiplication is a
submonoid of the integers with the operation multiplication. The semigroup
$(S_m, +)$ is a subsemigroup of $(S_n, +)$ for $m > n$, since then $S_m \subseteq S_n$.


Theorem 1.11 Let $(S, *)$ be a semigroup and $\{H_i : i \in I\}$ be subsemigroups
of $S$. If the intersection $\bigcap_{i \in I} H_i$ is nonempty, then it is a subsemigroup of $S$.


Proof Let $h, h' \in \bigcap_{i \in I} H_i$. Then $h, h' \in H_i$ for each $i \in I$ and $h * h' \in H_i$ for
each $i$. Therefore $h * h' \in \bigcap_{i \in I} H_i$, and $\bigcap_{i \in I} H_i$ is a subsemigroup of $S$.


Corollary 1.1 Let $(S, *)$ be a monoid and $\{H_i : i \in I\}$ be submonoids of $S$.
The intersection $\bigcap_{i \in I} H_i$ is a submonoid.











Theorem 1.12 Let (S, x) be a semigroup and W be a nonempty subset of S. 
There exists a smallest subsemigroup of S containing W. 


Proof Let H be the intersection of all subsemigroups of S containing W. 
By the previous theorem H is a subsemigroup and is contained in all other 
subsemigroups of S containing W. 














Definition 1.34 The smallest subsemigroup of $S$ containing the nonempty set
$W$ is the semigroup generated by $W$. It is denoted by $\langle W \rangle$.


The proof of the following theorem is left to the reader. 


Theorem 1.13 Let (S, x) be a semigroup and W be a nonempty subset of S. 
The set of all finite products of elements of W together with the elements of W 
is the semigroup generated by W. 


Definition 1.35 Let (M, x) be a monoid and W be a nonempty subset of M. 
The semigroup generated by W, together with the identity of M is called the 
monoid generated by $W$. It is denoted by $W^*$.


Definition 1.36 A commutative semigroup $(S, *)$ is a semilattice if $a * a = a$
for all $a \in S$. An element $a$ of a semigroup is called an idempotent element if
$a * a = a$. A semilattice is therefore a commutative semigroup in which every
element is an idempotent. If $(S, *)$ is a semilattice and $S' \subseteq S$ then $S'$ is a
subsemilattice of $S$ if $*$ is a binary operation on $S'$. Equivalently, $(S', *)$ is a
subsemilattice of $(S, *)$ if $S' \subseteq S$ and for every $a, b \in S'$, $a * b \in S'$.


Example 1.27 The semigroup consisting of all subsets of a fixed set $T$
together with the operation $\cap$ is a semilattice.


Obviously lower semilattices and upper semilattices are semilattices.
Conversely, given a semilattice $(S, *)$, a partial ordering on $S$ can be defined by
$s \le t$ if $s * t = t$.


Definition 1.37 If (S, x) is both a lower semilattice and an upper semilattice 
then it is called a lattice. If for any lattice (S, x) and any subset T of S, both 
the greatest lower bound and the least upper bound exist, then (S, x) is called 
a complete lattice. 


Definition 1.38 A group $G$ is a monoid such that for every $g \in G$, there exists
$g^{-1} \in G$ such that $gg^{-1} = g^{-1}g = 1$, where $1$ is the identity of the monoid.


If a semigroup $(S, *)$ is infinite, then it is possible that the semigroup
generated by $\{a\}$ is infinite. It consists of the elements $\{a, a^2, a^3, \ldots\}$ where
$a^{k+1} = a^k * a$. For example, if $\mathbb{Z}$ is the semigroup of integers under addition,
then the semigroup generated by $\{2\}$ is $\{2, 4, 6, 8, \ldots\}$. If a semigroup $(S, *)$ is
finite, however, then for some $k$ and $m$, $a^k = a^{k+m}$. Pick the smallest such $k$ and
$m$; then

$$a^k, a^{k+1}, a^{k+2}, \ldots, a^{k+m-1}$$

form a semigroup. If each element is multiplied by $a^k$ we again get
$a^k, a^{k+1}, a^{k+2}, \ldots, a^{k+m-1}$, so there is some $a^i$ so that $a^i * a^k = a^k$.
Therefore $a^i * a^{k+j} = a^{k+j}$ for all $0 \le j \le m - 1$ and $a^i$ is the identity of the
semigroup. Therefore the semigroup is a monoid. Also for each $a^j$ there exists
$a^{j'}$ such that $a^j * a^{j'} = a^{j'} * a^j = a^i$. This element is called the inverse of $a^j$.
Hence this set forms a group.
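
The argument can be tried on a small case. The following is a minimal sketch, assuming the integers modulo 12 under multiplication as the finite semigroup; the names `index_and_period`, `cyclic_part`, and `mul12` are illustrative and do not come from the text. It finds the smallest $k$ and $m$ with $a^k = a^{k+m}$ and then looks for the identity among $a^k, \ldots, a^{k+m-1}$.

```python
def index_and_period(a, op):
    """Return (k, m): the least k >= 1 and m >= 1 with a^k == a^(k+m)."""
    first_seen = {a: 1}      # element -> exponent at which it first appears
    power, exponent = a, 1
    while True:
        power, exponent = op(power, a), exponent + 1
        if power in first_seen:
            k = first_seen[power]
            return k, exponent - k
        first_seen[power] = exponent


def cyclic_part(a, op):
    """Return the list [a^k, a^(k+1), ..., a^(k+m-1)]."""
    k, m = index_and_period(a, op)
    power = a
    for _ in range(k - 1):
        power = op(power, a)
    part = [power]
    for _ in range(m - 1):
        power = op(power, a)
        part.append(power)
    return part


def mul12(x, y):
    # Assumed example semigroup: integers modulo 12 under multiplication.
    return (x * y) % 12


group = cyclic_part(2, mul12)
# The identity of the cyclic part is the element that fixes every member.
identity = [e for e in group if all(mul12(e, g) == g for g in group)]
print(index_and_period(2, mul12), group, identity)
```

Here the powers of 2 are 2, 4, 8, 4, ..., so $k = 2$, $m = 2$, and the cyclic part $\{4, 8\}$ is a group whose identity is 4, not the 1 of the surrounding monoid.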


Definition 1.39 A function $f$ from the semigroup $(S, *)$ to the semigroup $(T, \bullet)$
is called a semigroup homomorphism if $f(s * s') = f(s) \bullet f(s')$ for all $s, s' \in S$.
If a semigroup homomorphism $f$ is one-to-one and onto, then $f$ is called a
semigroup isomorphism. A function $f$ from the monoid $(S, *)$ to the monoid $(T, \bullet)$ is
called a monoid homomorphism if $f(s * s') = f(s) \bullet f(s')$ for all $s, s' \in S$
and $f$ maps the identity of $S$ to the identity of $T$. If a monoid homomorphism
$f$ is one-to-one and onto, then $f$ is called a monoid isomorphism.

Normally when a function is a homomorphism from a monoid to a monoid,
we shall assume that it is a monoid homomorphism and simply call it a
homomorphism.


Example 1.28 Let $f : (\mathbb{Z}, +) \to (\mathbb{Z}, +)$ be defined by $f(a) = 2a$; then $f$ is
a semigroup homomorphism. It is also a monoid homomorphism.


Example 1.29 Let $2\mathbb{Z}$ be the set of even integers and $f : (\mathbb{Z}, +) \to (2\mathbb{Z}, +)$
be defined by $f(a) = 2a$; then $f$ is a monoid homomorphism. It is also a monoid
isomorphism.


Example 1.30 Let $S$ be the semigroup of $n \times n$ matrices with the operation
multiplication, $\mathbb{R}$ be the semigroup of real numbers with the operation
multiplication, and $\det(A)$ be the determinant of a matrix $A$. Then $\det : (S, \cdot) \to (\mathbb{R}, \cdot)$
is a homomorphism.


Example 1.31 Let $\mathbb{R}_+$ denote the semigroup of positive real numbers with
the operation multiplication and $\ln$ be the natural logarithm; then $\ln : (\mathbb{R}_+, \cdot) \to
(\mathbb{R}, +)$ is a homomorphism.


The following theorem is left to the reader: 
Theorem 1.14 Let $f : S \to T$ be a homomorphism, then


(a) If $S'$ is a subsemigroup [submonoid] of $S$, then $f(S')$ is a subsemigroup
[submonoid] of $T$.

(b) If $T'$ is a subsemigroup [submonoid] of $T$, then $f^{-1}(T')$ is a subsemigroup
[submonoid] of $S$.

(c) If $f : S \to T$ is an isomorphism, then $f^{-1} : T \to S$ is an isomorphism.


Definition 1.40 A nonempty subset $T$ of a semigroup $S$ is a left ideal of $S$ if
$s \in S$ and $t \in T$ implies $st \in T$. A nonempty subset $T$ of a semigroup $S$ is a
right ideal of $S$ if $s \in S$ and $t \in T$ implies $ts \in T$. A subset $T$ of a semigroup
$S$ is an ideal of $S$ if it is both a left ideal of $S$ and a right ideal of $S$.


Obviously an ideal of S is a subsemigroup of S. 
Example 1.32 Let $S$ be the semigroup of $2 \times 2$ matrices with multiplication
as the operation and the integers as elements. Then matrices of the form
$\begin{pmatrix} a & 0 \\ b & 0 \end{pmatrix}$ form a left ideal and matrices of the form
$\begin{pmatrix} c & d \\ 0 & 0 \end{pmatrix}$ form a right ideal.


Example 1.33 The semigroup of even integers forms an ideal of $(\mathbb{Z}, \cdot)$.
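
The closure claims for the matrix families can be spot-checked on a finite sample. The sketch below is illustrative only: it assumes the two families are the matrices with zero second column and the matrices with zero second row, and the helper names are not from the text. A finite check is evidence, not a proof.

```python
from itertools import product


def matmul(x, y):
    """Product of two 2 x 2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def zero_second_column(m):
    return m[0][1] == 0 and m[1][1] == 0


def zero_second_row(m):
    return m[1] == (0, 0)


# A small finite sample of integer matrices (entries in {-1, 0, 1}).
sample = [((a, b), (c, d)) for a, b, c, d in product((-1, 0, 1), repeat=4)]

# Multiplying on the left by any s keeps the second column zero.
left_closed = all(zero_second_column(matmul(s, t))
                  for s in sample for t in sample if zero_second_column(t))
# Multiplying on the right by any s keeps the second row zero.
right_closed = all(zero_second_row(matmul(t, s))
                   for s in sample for t in sample if zero_second_row(t))
# The first family is not closed under multiplication on the right.
not_two_sided = any(not zero_second_column(matmul(t, s))
                    for s in sample for t in sample if zero_second_column(t))
print(left_closed, right_closed, not_two_sided)
```

All three values are true on this sample, which is consistent with the first family being closed on one side only.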


Definition 1.41 An equivalence relation $R$ on a semigroup $S$ is a congruence
if for all $a, b, c, d \in S$, $aRb$ and $cRd$ imply $acRbd$.


Definition 1.42 Let $R$ be a congruence on a semigroup $S$. Let $a_R$ be the
congruence class containing $a$. The set $S/R$ of all the congruence classes with
the multiplication $a_R \cdot b_R = (a * b)_R$ is called the quotient semigroup relative
to the congruence $R$.


Example 1.34 Let $\mathbb{Z}$, the set of integers, be a semigroup with the
operation addition [multiplication] and define $aRb$ if $a - b$ is a multiple of 5.
Then $[0]$, $[1]$, $[2]$, $[3]$, and $[4]$ form a semigroup with the operation addition
[multiplication].
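
A quick way to see why this relation is a congruence is to check that the class of a result depends only on the classes of the operands. The following sketch assumes integer operands and a finite test range; the function name `respects_classes` is hypothetical. The operation `max` is included only as a contrast, since it does not respect the classes.

```python
from operator import add, mul


def respects_classes(op, n=5, span=range(-10, 11)):
    """True if the class of op(a, c) modulo n depends only on the classes of a and c."""
    seen = {}
    for a in span:
        for c in span:
            key = (a % n, c % n)
            value = op(a, c) % n
            # setdefault stores the first value seen for this pair of classes.
            if seen.setdefault(key, value) != value:
                return False
    return True


# Operation table of the quotient semigroup under addition, on representatives 0..4.
addition_table = {(x, y): (x + y) % 5 for x in range(5) for y in range(5)}

print(respects_classes(add), respects_classes(mul), respects_classes(max))
```

Addition and multiplication pass the check, so the operation on classes is well defined; `max` fails because, for example, 1 and 6 lie in the same class but `max(1, 2)` and `max(6, 2)` do not.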


The proof of the following theorem is left to the reader. 


Theorem 1.15 Let $R$ be a congruence on a semigroup $S$. Then $S/R$ is a
semigroup with the operation defined in the previous definition and $\phi_R : S \to
S/R$ defined by $\phi_R(s) = s_R$ is a homomorphism.


Theorem 1.16 Let $f : A \to B$ be a homomorphism and $R$ be the congruence
$aRa'$ iff $f(a) = f(a')$; then there exists a homomorphism $g : A/R \to B$
defined by $g(a_R) = f(a)$. Hence $g \circ \phi_R = f$.


[Commutative diagram with arrows $f : A \to B$, $\phi_R : A \to A/R$, and $g : A/R \to B$.]


Proof We showed in Theorem 1.9 that $g$ is a function.

$$\begin{aligned}
g(a_R \cdot a'_R) &= f(a \cdot a') \\
&= f(a) \cdot f(a') \\
&= g(a_R) \cdot g(a'_R).
\end{aligned}$$














Theorem 1.17 Let $f : A/R \to B$ be a homomorphism and $S \subseteq R$, so that $aSa'$
implies $aRa'$; then there exist homomorphisms $g : A/S \to B$ and $i : A/S \to A/R$
such that $f \circ i = g$.


[Commutative diagram with arrows $f : A/R \to B$, $i : A/S \to A/R$, and $g : A/S \to B$.]


Proof Let $i : A/S \to A/R$ be defined by $i(a_S) = a_R$ and $g : A/S \to B$ by
$g(a_S) = f(a_R)$. The function $i$ is trivially well defined and a homomorphism.
The proof that $g$ is a function is similar to the proof of Theorem 1.9. (See
Theorem 1.10.)

$$\begin{aligned}
g(a_S \cdot a'_S) &= g((aa')_S) \\
&= f((aa')_R) \\
&= f(a_R a'_R) \\
&= f(a_R) f(a'_R) \\
&= g(a_S) g(a'_S).
\end{aligned}$$














We already know that the set of all functions from a set $A$ to itself forms a
semigroup since for $a \in A$, and functions $f$, $g$, and $h$ from $A$ to itself,
$(f \circ (g \circ h))(a) = ((f \circ g) \circ h)(a) = f(g(h(a)))$. Also since $f$, $g$, and $h$ are relations
we have already proven that $f \circ (g \circ h) = (f \circ g) \circ h$.

Conversely, given a semigroup $S$, and $s \in S$ we can define a function $\phi_s :
S \to S$ by $\phi_s(t) = st$ for all $t \in S$. Let $T_S = \{\phi_s : S \to S \text{ for } s \in S\}$. For all
$s$, $t$, and $u$ in $S$,

$$\begin{aligned}
\phi_{st}(u) &= (st)u \\
&= s(tu) = s\phi_t(u) \\
&= (\phi_s \circ \phi_t)(u)
\end{aligned}$$

and $\phi_{st} = \phi_s \circ \phi_t$. Let $\tau : S \to T_S$ be defined by $\tau(s) = \phi_s$. The function $\tau$
is a homomorphism since

$$\tau(st) = \phi_{st} = \phi_s \circ \phi_t = \tau(s) \cdot \tau(t).$$


Theorem 1.18 Every semigroup is isomorphic to a semigroup of functions
from a set to itself with operation composition. If $S$ is a monoid, then $S$ is
isomorphic to a monoid of functions from $S$ to itself.


Proof Given a semigroup $S$, and $s \in S$ define $\phi^1_s : S^1 \to S^1$ by $\phi^1_s(t) = st$
for all $t \in S^1$ and let $T^1_S = \{\phi^1_s : S^1 \to S^1 \text{ for } s \in S\}$. Let $\tau^1 : S \to T^1_S$ be
defined by $\tau^1(s) = \phi^1_s$. Using the same argument as above, we see that $\tau^1$ is
a homomorphism. But if $\phi^1_s = \phi^1_t$, since $\phi^1_s(1) = s1 = s$ and $\phi^1_t(1) = t1 = t$,
then $s = t$ and $\tau^1$ is an isomorphism. The second part of the theorem follows
immediately.
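
The role of the adjoined identity in this proof can be seen on a tiny example. The sketch below assumes a two-element semigroup in which $st = t$ for all $s, t$ (chosen here for illustration, not taken from the text); all names are hypothetical. On $S$ itself the left translations coincide, while on $S^1$ they are distinct.

```python
def right_zero(s, t):
    # Assumed example operation: the product of s and t is always t.
    return t


S = ["x", "y"]
ONE = "1"          # adjoined identity
S1 = S + [ONE]     # the monoid S^1


def op1(s, t):
    """The operation extended to S^1, with ONE acting as identity."""
    if s == ONE:
        return t
    if t == ONE:
        return s
    return right_zero(s, t)


def translation(s, domain, op):
    """Left translation t -> s*t, stored as a tuple of images in the order of `domain`."""
    return tuple(op(s, t) for t in domain)


def compose(f, g, domain):
    """Tuple form of the composite map: apply g first, then f."""
    index = {t: i for i, t in enumerate(domain)}
    return tuple(f[index[g[i]]] for i in range(len(domain)))


on_S = {s: translation(s, S, right_zero) for s in S}
on_S1 = {s: translation(s, S1, op1) for s in S}

print(on_S["x"] == on_S["y"])      # translations on S coincide
print(on_S1["x"] != on_S1["y"])    # translations on S^1 are distinct
print(all(on_S1[right_zero(s, t)] == compose(on_S1[s], on_S1[t], S1)
          for s in S for t in S))  # the map s -> translation is a homomorphism
```

Evaluating each translation at the adjoined identity recovers the element itself, which is exactly the step used in the proof to show the map is one-to-one.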














Exercises 


(1) Prove Theorem 1.13. Let $(S, *)$ be a semigroup and $W$ be a nonempty
subset of $S$. The set of all finite products of elements of $W$ together with
the elements of $W$ is the subsemigroup generated by $W$.

(2) Prove Theorem 1.14. Let $f : S \to T$ be a homomorphism, then

(a) If $S'$ is a subsemigroup [submonoid] of $S$, then $f(S')$ is a subsemigroup
[submonoid] of $T$.

(b) If $T'$ is a subsemigroup [submonoid] of $T$, then $f^{-1}(T')$ is a
subsemigroup [submonoid] of $S$.

(c) If $f : S \to T$ is an isomorphism, then $f^{-1} : T \to S$ is an
isomorphism.

(3) Prove Theorem 1.15. Let $R$ be a congruence on a semigroup $S$. Then
$S/R$ is a semigroup with the operation defined by $a_R \cdot b_R = (a * b)_R$ and
$\phi_R : S \to S/R$ is a homomorphism.

(4) Prove that a finite semigroup S contains a subgroup. 

(5) Give an example of a group which contains a subsemigroup that is not a 
monoid. 

(6) Prove that the identity of a monoid is unique. 

(7) Prove that if a semigroup contains a 0, then it is unique. 

(8) Prove that if $G$ is a finite group and $H$ is a subgroup of $G$, then $|H| = |gH|$
for every $g \in G$.

(9) Prove that if $G$ is a finite group and $H$ is a subgroup of $G$, then $H = gH$
if and only if $g \in H$.

(10) Prove that if $G$ is a finite group and $H$ is a subgroup of $G$, then $|H|$ divides
$|G|$.

(11) An idempotent of a semigroup $S$ is an element $a$ such that $a \cdot a = a$. Prove
that if $f : S \to T$ is a homomorphism, then if $a$ is an idempotent, $f(a)$
is an idempotent.

(12) An element $a$ of a semigroup $S$ is a left identity if $as = s$ for all $s \in S$.
An element $a$ of a semigroup $S$ is a right identity if $sa = s$ for all $s \in S$.
Give an example of a semigroup having more than one left identity.

(13) Let $f : S \to T$ be a homomorphism. Prove that if $T$ contains $0$, then
$f^{-1}(0)$ is an ideal.

(14) Using the properties in Theorem 1.1, prove that if $S = \mathcal{P}(C)$ for some
nonempty set $C$, then $(S, \triangle)$ is a monoid, where $\triangle$ denotes the symmetric
difference. What is the identity? Is $(S, \triangle)$ a group?

(15) Let $S = \mathcal{P}(C)$ for some nonempty set $C$. Is $(S, \cup)$ a monoid? Is $(S, \cap)$ a
monoid? Are they groups?

(16) Define the multiplication of permutations of a set to be composition as
shown in the previous section. Prove that the set of permutations of a set
with this multiplication form a group.


Languages and codes


2.1 Regular languages 


Definition 2.1 An alphabet, denoted by $\Sigma$, is a set of symbols. A string or
word is a sequence $a_1a_2a_3a_4 \ldots a_n$ where $a_i \in \Sigma$.


Thus if $\Sigma = \{a, b\}$, then $aab$, $a$, $baba$, $bbbbb$, and $baaaaa$ would all be
strings of symbols of $\Sigma$. In addition we include an empty string denoted by $\lambda$
which has no symbols in it.


Definition 2.2 Let $\Sigma^*$ denote the set of all strings of $\Sigma$ including the empty
string. Define the binary operation $\circ$ called concatenation on $\Sigma^*$ as follows:
If $a_1a_2a_3a_4 \ldots a_n$ and $b_1b_2b_3b_4 \ldots b_m \in \Sigma^*$ then

$$a_1a_2a_3a_4 \ldots a_n \circ b_1b_2b_3b_4 \ldots b_m = a_1a_2a_3a_4 \ldots a_n b_1b_2b_3b_4 \ldots b_m.$$

If $S$ and $T$ are subsets of $\Sigma^*$ then $S \circ T = \{s \circ t : s \in S, t \in T\}$. The set $S \circ T$
is often denoted as $ST$.


Thus if $\Sigma = \{a, b\}$, then $aabba \circ babaa = aabbababaa$. In particular, if $w$
is a string in $\Sigma^*$ then $\lambda \circ w = w \circ \lambda = w$, so that a string followed or preceded
by the empty string simply gives the original string. Notice that in general it is
not true that $w \circ v = v \circ w$.
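
As an informal illustration, concatenation of words and of sets of words can be modeled with Python strings and sets. This is only a sketch; the function name `concat` and the sample sets are chosen here and are not part of the text. The empty Python string plays the role of the empty string.

```python
def concat(S, T):
    """Return the set of all words s followed by t, for s in S and t in T."""
    return {s + t for s in S for t in T}


S = {"a", "ab"}
T = {"b", ""}        # "" stands for the empty string

print(concat({"aabba"}, {"babaa"}))
print(sorted(concat(S, T)))
print(concat(S, T) == concat(T, S))   # concatenation is not commutative
print(concat(S, {""}) == S)           # the empty string acts as an identity
```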

The following is a specific case of the submonoid generated by a subset of 
a monoid described in Chapter 1. 


Definition 2.3 Let $B$ be a subset of $\Sigma^*$; then $B^*$ is the set of all strings or
words formed by concatenating words from $B$ together with the empty string, i.e.
$B^* = \{w_1w_2 \ldots w_n : w_i \in B\} \cup \{\lambda\}$. If $\emptyset$ denotes the empty set then $\emptyset^* = \{\lambda\}$.


The symbol $*$ is called the Kleene star and is named after the mathematician
and logician Stephen Cole Kleene.
Note that $\Sigma^*$ is consistent with this definition.


Example 2.1 $\{a\}^* = \{\lambda, a, aa, aaa, \ldots\}$.


Example 2.2 $\{a\}\{ab\}^*\{c\} = \{ac, aabc, aababc, \ldots\}$.


Let $A^+$ be the set consisting of all finite products of elements of a nonempty
set $A$ together with the operation of concatenation. The set $A^+ = \langle A \rangle$, and
hence is a semigroup as shown in Theorem 1.13. From Theorem 1.13 and the
definition of $A^*$ we know that if $A$ is a nonempty subset of $\Sigma$ then the set
$(A^*, \circ)$ is a monoid where $\lambda$ is the identity. It is the submonoid generated
by $A$. If $A$ does not contain the empty word, then $(A^*, \circ)$ differs from $(A^+, \circ)$
since it contains the empty word. Thus if $A = \{a\}$, then $A^+ = \{a, a^2, a^3, \ldots\}$
and $A^* = \{\lambda, a, a^2, a^3, \ldots\}$. Note that $A^+ = aA^*$.
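
Since $B^*$ is usually infinite, a program can only list its words up to some length. The following sketch does that; the function name `star_up_to` and the length bound are assumptions made for illustration.

```python
def star_up_to(B, max_len):
    """Return all words of B* whose length is at most max_len."""
    words = {""}          # the empty string always belongs to B*
    frontier = {""}
    while frontier:
        # Extend the newest words by one more factor from B, keeping only new words.
        frontier = {w + b for w in frontier for b in B
                    if len(w + b) <= max_len} - words
        words |= frontier
    return words


print(sorted(star_up_to({"a"}, 3)))
print(star_up_to(set(), 5))           # the star of the empty set
# Words of {a}{ab}*{c} of length at most 6, as in Example 2.2.
print(sorted("a" + w + "c" for w in star_up_to({"ab"}, 4)))
```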


Definition 2.4 Let $\Sigma^*$ denote the set of all strings of $\Sigma$ including the empty
string. A subset $L$ of $\Sigma^*$ is called a language.


If $\Sigma$ is the set of letters in the English alphabet, then $L$ could be the set of
words in the English language. If $\Sigma$ is the set of letters in the Greek alphabet,
then $L$ could be the set of words in the Greek language. If $\Sigma$ is the set of symbols
used in a computer language, then $L$ could be the set of words in that language.
Since every subset of $\Sigma^*$ is a language, many will be difficult or impossible to
describe. In particular a language is not necessarily closed under the operation
of concatenation.

If $\Sigma$ is the set $\{a, b, c\}$ then the following are languages:


$L_1 = \{a, aab, aaabb, aaaabbb, \ldots\}$,

$L_2 = \{w : w \in \Sigma^*$ and contains exactly one $a$ and one $b\}$,

$L_3 = \{w : w \in \Sigma^*$ and contains exactly two $b$s$\}$,

$L_4 = \{w : w \in \Sigma^*$ and contains at least two $b$s$\}$,

$L_5 = \{w : w \in \Sigma^*$ and contains the same number of $a$s, $b$s, and $c$s$\}$,

$L_6 = \{w : w = a^nb^n$ for $n \ge 1\}$,

$L_7 = \{w : w = a^nb^nc^n$ for $n \ge 1\}$,

$L_8 = \{w : w \in \Sigma^*$ and contains no $c$s$\}$.


Definition 2.5 Let $\Sigma$ be an alphabet. The class of regular expressions $R$ over
$\Sigma$ is defined by the following rules using $\Sigma$ and the symbols $\emptyset$, $\lambda$, ${}^*$, $\vee$, $($, and $)$.
The symbol $\lambda$ is used to denote the symbol $\emptyset^*$.


(i) The symbol $\emptyset$ is a regular expression and for every $a \in \Sigma$, the symbol $a$ is
a regular expression.
(ii) If $w_1$ and $w_2$ are regular expressions, then $w_1w_2$, $w_1 \vee w_2$, $w_1^*$, and $(w_1)$
are regular expressions.
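(iii) There are no regular expressions which are not generated by (i) and (ii).

Each expression corresponds to a set with the following correspondence $\mathcal{L}$
defined by

$$\begin{aligned}
\mathcal{L}(\emptyset) &= \emptyset, \\
\mathcal{L}(a) &= \{a\} \text{ for all } a \in \Sigma, \\
\mathcal{L}(\lambda) &= \{\lambda\}, \\
\mathcal{L}(w_1 \vee w_2) &= \mathcal{L}(w_1) \cup \mathcal{L}(w_2) \text{ for expressions } w_1, w_2, \\
\mathcal{L}(w_1w_2) &= \mathcal{L}(w_1) \circ \mathcal{L}(w_2), \\
\mathcal{L}(w_1^*) &= \mathcal{L}(w_1)^*,
\end{aligned}$$

so that

$$\begin{aligned}
\mathcal{L}(aa^*) &= \{a\} \circ \{a\}^* = \{a, aa, aaa, aaaa, aaaaa, \ldots\}, \\
\mathcal{L}(a(b \vee c)d) &= \{a\} \circ (\{b\} \cup \{c\}) \circ \{d\} = \{abd, acd\}, \\
\mathcal{L}((a \vee b)^*) &= (\{a\} \cup \{b\})^* = \{\lambda, a, b, ab, ba, abb, aba, \ldots\} \\
&= \text{all strings consisting of 0 or more } a\text{s and } b\text{s}, \\
\mathcal{L}(ab^*c) &= \{a\} \circ \{b\}^* \circ \{c\} = \{ac, abc, abbc, abbbc, \ldots\}, \\
\mathcal{L}(a^* \vee b^* \vee c^*) &= \{a\}^* \cup \{b\}^* \cup \{c\}^* = \{\lambda, a, b, c, \ldots, a^k, b^k, c^k, \ldots\}, \\
\mathcal{L}(\lambda) &= \mathcal{L}(\emptyset^*) = \{\lambda\}, \\
\mathcal{L}((a \vee b)c) &= (\{a\} \cup \{b\}) \circ \{c\} = \{ac, bc\}.
\end{aligned}$$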
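
The correspondence from expressions to sets is itself recursive, so it can be sketched as a short recursive function. The representation of expressions as nested tuples, the constructor names, and the length bound `n` are all assumptions made for this illustration; they are not notation from the text. Because the sets may be infinite, only words of length at most `n` are produced.

```python
EMPTY = ("empty",)                 # the expression for the empty set


def sym(a):
    return ("sym", a)


def alt(r, s):                     # r "or" s
    return ("or", r, s)


def cat(r, s):                     # r followed by s
    return ("cat", r, s)


def star(r):
    return ("star", r)


LAMBDA = star(EMPTY)               # the empty-string expression, as the star of the empty set


def lang(expr, n):
    """Return the words of length at most n in the set denoted by expr."""
    kind = expr[0]
    if kind == "empty":
        return set()
    if kind == "sym":
        return {expr[1]} if len(expr[1]) <= n else set()
    if kind == "or":
        return lang(expr[1], n) | lang(expr[2], n)
    if kind == "cat":
        return {u + v for u in lang(expr[1], n) for v in lang(expr[2], n)
                if len(u + v) <= n}
    if kind == "star":
        base = lang(expr[1], n)
        words, frontier = {""}, {""}
        while frontier:
            frontier = {w + b for w in frontier for b in base
                        if len(w + b) <= n} - words
            words |= frontier
        return words
    raise ValueError("unknown expression kind: %r" % (kind,))


a, b, c, d = sym("a"), sym("b"), sym("c"), sym("d")
print(sorted(lang(cat(a, star(a)), 4)))
print(sorted(lang(cat(a, cat(alt(b, c), d)), 3)))
print(sorted(lang(cat(a, cat(star(b), c)), 4)))
print(lang(LAMBDA, 3))
```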


The image of a regular expression is a regular language. Regular languages 
may be defined as follows: 


Definition 2.6 The class $R$ of regular languages over $\Sigma$ has the following
properties:


(i) The empty set, $\emptyset \in R$, and if $a \in \Sigma$, then $\{a\} \in R$.
(ii) If $s_1$ and $s_2 \in R$, then $s_1 \cup s_2$, $s_1 \circ s_2$, $s_1^* \in R$.
(iii) Only sets formed using (i) and (ii) belong to $R$.


Although it will not be shown until later, the intersection of regular sets is a 
regular set and the complement of a regular set is a regular set. 

The previous definitions of regular languages and regular expressions are 
examples of recursive definitions. In a recursive definition there are three 
steps. (1) Certain objects are defined to be in the set. (2) Rules are listed for 
producing new objects in the set from other objects already in the set. (3) There 
are no other elements in the set. Mathematical induction is a special case of a 
recursive definition. We shall see that the set $\{ab, a^2b^2, \ldots, a^nb^n, \ldots\}$ is not
a regular set. However, we cannot assume this is the case because we cannot
immediately describe the set using the definition. In general it is not always
easy to show that a set is not regular. Later, we shall show how to determine
that many sets are not regular.


Example 2.3 Examples of regular expressions include $(a \vee b)^*$, $(a^* \vee b^*)$,
$a^*(c \vee d)a(b \vee a \vee c)^*$, and $\lambda$. Examples of regular sets include $\{a, b, c\}$,
$\{a\}^*$, $\{ab\}^*$, $\{c\}\{b\}^*$, $\{a\} \cup \{b\} \cup \{cd\}$, and $(\{a\} \cup \{b\})^* \cup \{c\}^*\{d\} \cup \{\lambda\}$.


As mentioned previously, not all classes of languages are so easily defined. 
In the following chapters we shall define machines that generate languages 
and machines that accept languages. A machine accepts a language if it can 
determine whether a string is in the language. Many languages are defined by 
the fact that they can be generated or accepted by a particular type of machine. 

If $T^* = S$, $T$ is not usually uniquely defined. If $T = \{a, b, c, d\}$, and
$T' = \{a, b, c, d, ab, cd, bc\}$, then $T^* = T'^*$ but, while every string in $T^*$ can
be expressed uniquely as the concatenation of elements of $T$, this is not true of
elements of $T'$ since the expression $abcd$ can be expressed as $(a)(b)(c)(d)$, and
also as $(a)(b)(cd)$, $(a)(bc)(d)$, $(ab)(cd)$, etc.


Definition 2.7 A code is a subset of $\Sigma^*$. If $C$ is a subset of $\Sigma^*$ and every string
in $S$ can be expressed as the concatenation of elements of $C$, then we say that
$C$ is a code for $S$. A code $C$ is uniquely decipherable if every string in $S$ can
be uniquely expressed as the concatenation of elements of $C$.


Therefore $\{ba, ab, ca\}$, $\{ade, ddbee, dfc, dgd\}$, and $\{ae, b, c, de\}$ are
uniquely decipherable codes while $\{a, ab, bc, c\}$, $\{ab, abc, cde, de\}$, and
$\{a, bc, ab, c\}$ are not uniquely decipherable codes.

Note that in many texts, a subset of $\Sigma^*$ is defined to be a code only if it is
uniquely decipherable.
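
For a finite code, unique decipherability can be decided mechanically. The sketch below uses the Sardinas-Patterson test, a standard technique that is not described in this text; the function names are chosen here. It repeatedly collects the "dangling suffixes" left over when one codeword, or a previous dangling suffix, is a prefix of another, and reports failure if the empty string ever appears.

```python
def left_quotient(A, B):
    """Return the set of words w such that a + w is in B for some a in A."""
    return {b[len(a):] for a in A for b in B if b.startswith(a)}


def is_uniquely_decipherable(code):
    """Sardinas-Patterson test for a finite set of words."""
    code = set(code)
    if "" in code:
        return False                          # the empty word allows repeated factorizations
    current = left_quotient(code, code) - {""}
    seen = set()
    while current:
        if "" in current:
            return False                      # some string has two different factorizations
        seen |= current
        nxt = left_quotient(code, current) | left_quotient(current, code)
        current = nxt - seen                  # only explore suffixes not handled before
    return True


for code in [{"ba", "ab", "ca"}, {"ae", "b", "c", "de"},
             {"a", "ab", "bc", "c"}, {"ab", "abc", "cde", "de"}]:
    print(sorted(code), is_uniquely_decipherable(code))
```

The loop terminates because every dangling suffix is a suffix of some codeword, so only finitely many can occur.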


Definition 2.8 Let $\Sigma$ be an alphabet. A nonempty code $C \subseteq \Sigma^*$ is called a
prefix code if for all words $u, v \in C$, if $u = vw$ for $w \in \Sigma^*$, then $u = v$ and
$w = \lambda$. This means that no word in a code can be the beginning string of another
word in the code. A nonempty code $C \subseteq \Sigma^*$ is called a suffix code if for all
words $u, v \in C$, if $u = wv$ for $w \in \Sigma^*$, then $u = v$ and $w = \lambda$. This means
that no word in a code can be the final string of another word in the code. A
nonempty code $C \subseteq \Sigma^*$ is called a biprefix code if it is both a prefix and a suffix
code. A nonempty code $C \subseteq \Sigma^*$ is called an infix code if no word in the code
can be a substring of another word in the code so that if $u$ and $wuv$ are words
in the code for $w, v \in \Sigma^*$, then $w = v = \lambda$. A code is called a block code if
each string in the code has equal length.


The set {a, ab, abc} is a uniquely decipherable code but it is not a prefix 
code since a is the initial string of both ab and abc and ab is an initial string of 
abc. It is however a suffix code. The set {a, ba, ca} is a prefix code, but it is not 
a suffix code since a is the final string of both ba and ca. The set {ad, ab, ac} 
is a biprefix code. Any code whose regular expression begins with $a^*$ is not a
suffix code and any code whose regular expression ends with $a^*$ is not a prefix
code. However, $a^*b$ is a prefix code and $ab^*$ is a suffix code.
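
For finite codes the prefix and suffix conditions are easy to test directly. The following is a minimal sketch with function names chosen here; it assumes the code is given as a nonempty finite set of Python strings.

```python
def is_prefix_code(code):
    """True if no word of the code is the beginning of a different word of the code."""
    return not any(u != v and v.startswith(u) for u in code for v in code)


def is_suffix_code(code):
    """True if no word of the code is the ending of a different word of the code."""
    return not any(u != v and v.endswith(u) for u in code for v in code)


def is_biprefix_code(code):
    return is_prefix_code(code) and is_suffix_code(code)


for code in [{"a", "ab", "abc"}, {"a", "ba", "ca"}, {"ad", "ab", "ac"}]:
    print(sorted(code), is_prefix_code(code), is_suffix_code(code))
```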
The proofs of the following theorems are left to the reader: 


Theorem 2.1 If a code is a suffix, prefix, infix, biprefix, or block code, then it
is uniquely decipherable.


Theorem 2.2 A block code is a suffix, prefix, infix, and biprefix code. 


Theorem 2.3 An infix code is a biprefix code. 


Exercises 


(1) Given $w = 10110$, find five words $v_1, v_2, v_3, v_4, v_5$ such that $v_iw = wv_i$
for $1 \le i \le 5$.
(2) Find regular sets corresponding to the following expressions. If the set is
infinite, list ten elements in the set:
(a) $a(b \vee c \vee d)a$
(b) $a^*b^*c$
(c) $(a \vee b)(c \vee d)$
(d) $(ab^*\lambda) \vee (cd)^*$
(e) $a(bc)^*d$.
(3) Find regular sets corresponding to the following expressions. If the set is
infinite, list ten elements in the set:
(a) $bc(bc)^*$
(b) $(a \vee b^* \vee \lambda)(c \vee d^*)$
(c) $(a \vee bc \vee d)^*$
(d) $(a \vee b)(c \vee d)b$
(e) $a^*(b \vee c \vee d)^*$.
(4) Find regular expressions that correspond to the following regular sets: 
(a) {ab, ac, ad} 
(b) {ab, ac, bb, bc} 
(c) {a, ab, abb, abbb, abbbb, ...} 
(d) {ab, abab, ababab, abababab, ababababab, ...} 
(e) {ab, abb, aab, aabb}. 
(5) Find regular expressions that correspond to the following regular sets: 
(a) {ab, acb, adb} 
(b) {ab, abb, abbb, abbbb, ...} 
(c) {ad, ae, af, bd, be, bf, cd, ce, cf} 
(d) {abcd, abcbcd, abcbcbcd, abcbcbcbcd, ...} 
(e) $\{abcd, abef, cdcd, cdef\}$.
(6) Let $\Sigma = \{a, b, c\}$.

(a) Give a regular expression for the set of all elements of $\Sigma^*$ containing
exactly two $b$s.

(b) Give a regular expression for the set of all elements of $\Sigma^*$ containing
exactly two $b$s and two $c$s.

(c) Give a regular expression for the set of all elements of $\Sigma^*$ containing
two or more $b$s.

(d) Give a regular expression for the set of all elements of $\Sigma^*$ beginning
and ending with $a$ and containing at least one $b$ and one $c$.

(e) Give a regular expression for the set of all elements of $\Sigma^*$ consisting
of one or more $a$s, followed by one or more $b$s and then one or more
$c$s.

(7) Let $\Sigma = \{a, b\}$.

(a) Give a regular expression for the set of all elements of $\Sigma^*$ containing
exactly two $b$s or exactly two $a$s.

(b) Give a regular expression for the set of all elements of $\Sigma^*$ containing
an even number of $b$s.

(c) Give a regular expression for the set of all elements of $\Sigma^*$ beginning
and ending with $a$ and containing at least one $b$.

(d) Give a regular expression for the set of all elements of $\Sigma^*$ such that
the number of $a$s in each string is divisible by 3 or the number of $b$s
is divisible by 5.

(e) Give a regular expression for the set of all elements of $\Sigma^*$ such that
the length of each string is divisible by 3.

(8) Which of the following are uniquely decipherable codes? 

(a) {ab, ba, a, b} 

(b) {ab, acb, accb, acccb, ...} 

(c) {a, b, c, bd} 

(d) {ab, ba, a} 

(e) {a, ab, ac, ad}. 

(9) Which of the following expressions describe uniquely decipherable codes? 

(a) ab* 

(b) $ab^* \vee baaa$

(c) $ab^*c \vee baaac$

(d) $(a \vee b)(b \vee a)$

(e) $(a \vee b \vee \lambda)(b \vee a \vee \lambda)$.

(10) Which of the following are uniquely decipherable codes? Which are suffix 
codes? 

(a) {ab, ba} 

(b) {ab, abc, bc}
(c) {a, b, c, bd}
(d) {aba, ba, c} 
(e) {ab, acb, accb, acccb}. 

(11) Which of the following expressions describe prefix codes? Which describe 
suffix codes? 

(a) ab* 

(b) ab*c 

(c) $a^*bc^*$

(d) $(a \vee b)(b \vee a)$
(e) $a^*b$.

(12) Show that the intersection of the monoids $\{ac, bc, d\}^*$ and $\{a, cb, cd\}^*$
is the monoid generated by the code described by the expression
$ac(bc)^*d$.

(13) Prove Theorem 2.1. If a code is a suffix, prefix, infix, biprefix, or block 
code, then it is uniquely decipherable. 

(14) Prove Theorem 2.2. A block code is a suffix, prefix, infix, and biprefix 
code. 

(15) Prove Theorem 2.3. An infix code is a biprefix code. 


2.2 Retracts (Optional) 


In this section we discuss an additional source of examples of regular lan- 
guages: the fixed languages of endomorphisms of free monoids A*. Each such 
language is necessarily a submonoid of A* and is the image of a special type of 
endomorphism called a retraction. Such images are called retracts and they are 
characterized among submonoids as those submonoids that are generated by a 
special class of codes called key codes. 


Definition 2.9 Let $X$ be a set and let $f : X \to X$ be a function having the
property that $f(f(x)) = f(x)$ for all $x$ in $X$. A function with this property is
called a retraction of $X$ and its image is called a retract of $X$.


Notice that the restriction of a retraction $f : X \to X$ to the image of $f$,
$\{f(x) : x \in X\}$, is the identity mapping of the image of $f$ onto itself.


Example 2.4 For the real numbers $\mathbb{R}$, the absolute value function $f : \mathbb{R} \to \mathbb{R}$,
defined by $f(r) = |r|$, is a retraction and $\{x \in \mathbb{R} : x \ge 0\}$ is its associated retract.
The floor and ceiling functions, when regarded as functions from $\mathbb{R}$ into $\mathbb{R}$,
provide two additional examples of retractions which determine the same retract
$\{r \in \mathbb{R} : r \text{ is an integer}\}$.


Notice that, when $X$ is merely a set having no specified structure, every
nonempty subset $S$ of $X$ is a retract of $X$ since $S$ is the image of the retraction
$r : X \to X$ defined by $r(x) = x$ if $x$ is in $S$ and otherwise $r(x) = s$, where $s$
is a fixed element of $S$. When structures are specified on a set, its retracts may
become quite interesting. In this section we study the retracts of free monoids $A^*$,
where $A$ is a finite set. Several of our results hold also when $A$ is infinite but we
leave these extensions to the interested reader. We will assume in this section and
the next that alphabets are always finite.


Definition 2.10 The fixed language of a homomorphism $h : A^* \to A^*$ is the
set $L = \{w \in A^* : h(w) = w\}$.


Note that the fixed language of each homomorphism is a submonoid of $A^*$.


Example 2.5 Let $A = \{a, b, c, d\}$ and let $f$ be the homomorphism $f : A^* \to
A^*$ defined by $f(a) = dad$, $f(b) = bc$, $f(c) = d$, $f(d) = \lambda$. The fixed
language of $f$ is the submonoid generated by the set $\{dad, bcd\}$. Notice that
$f(f(b)) = f(bc) = f(b)f(c) = bcd$, which is not equal to $f(b) = bc$.
Consequently $f$ is not a retraction. However, notice also that the homomorphism
$r : A^* \to A^*$, defined by $r(a) = dad$, $r(b) = bcd$, $r(c) = r(d) = \lambda$, is a
retraction whose image is the fixed language of $f$. Thus the fixed language of $f$
is a retract even though $f$ itself is not a retraction. Finally, note that $\{dad, bcd\}$
is a uniquely decipherable code.
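
The claims of Example 2.5 can be explored by brute force on short words. In the sketch below, the helper `extend` and the length bounds are assumptions made for illustration; a homomorphism is represented by the images of the four symbols, and the empty Python string stands for the empty word.

```python
from itertools import product


def extend(images):
    """Extend a map on symbols to a homomorphism on words."""
    return lambda w: "".join(images[ch] for ch in w)


f = extend({"a": "dad", "b": "bc", "c": "d", "d": ""})
r = extend({"a": "dad", "b": "bcd", "c": "", "d": ""})


def words_up_to(alphabet, n):
    """All words over the alphabet of length at most n."""
    return ["".join(p) for k in range(n + 1) for p in product(alphabet, repeat=k)]


print(f(f("b")), f("b"))                                        # f is not a retraction
print(all(r(r(w)) == r(w) for w in words_up_to("abcd", 4)))     # r behaves as a retraction
fixed = {w for w in words_up_to("abcd", 6) if f(w) == w}
print(sorted(fixed, key=lambda w: (len(w), w)))                 # fixed words of f, length <= 6
```

The fixed words found this way are the empty word, `dad`, `bcd`, and the four two-factor products of these, which is what one expects from the submonoid generated by $\{dad, bcd\}$.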


The behavior of a, b, and c in Example 2.5 will provide an illustration of the 
classification of alphabetical symbols that will be necessary for understanding 
retracts. 


Definition 2.11 Let $A$ be a set and let $f$ be a homomorphism $f : A^* \to A^*$.
A symbol $a$ in $A$ is said to be mortal, with respect to $f$, if there is a positive
integer $n$ for which $f^n(a) = \lambda$; otherwise $a$ is said to be vital.


For each homomorphism $f$, the mortal/vital dichotomy of the symbols of
$A$ may be determined as follows. For each nonnegative integer $j$ let $A_j$ be
defined inductively by: $A_0$ is empty; $A_1 = \{a \in A : f(a) = \lambda\}$; and for $j \ge 2$,
$A_j = \{a \in A : f(a) \in A_{j-1}^*\}$. Since $A$ is finite there will be a least nonnegative
integer $m$ for which $A_m = A_{m+1}$. The set $A_m$ is the set of all mortal symbols
and its complement in $A$ is the set of vital symbols.
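
The inductive construction of the sets $A_j$ translates directly into a loop that stops when the set no longer grows. This is a sketch under the assumption that the homomorphism is given by a dictionary from symbols to their images; the function name `mortal_symbols` is chosen here.

```python
def mortal_symbols(images):
    """Return the mortal symbols of the homomorphism given by symbol -> image word."""
    mortal = set()                       # corresponds to A_0, the empty set
    while True:
        # A symbol joins the next set when its image uses only symbols already in the set.
        # For the first pass this selects exactly the symbols with empty image.
        bigger = {a for a, w in images.items() if all(ch in mortal for ch in w)}
        if bigger == mortal:
            return mortal
        mortal = bigger


example = {"a": "dad", "b": "bc", "c": "d", "d": ""}   # the homomorphism of Example 2.5
mortal = mortal_symbols(example)
vital = set(example) - mortal
print(sorted(mortal), sorted(vital))
```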

Notice that in Example 2.5 the symbols d and c are mortal and the symbols a 
and b are vital. Note also that the fixed language is the submonoid generated by 
a set of words in each of which there is exactly one occurrence of a vital symbol. 
Further, each of these vital symbols occurs in only one of the generators. In 
this section we show that the simple observations concerning fixed languages,
retractions, and codes, made for Example 2.5, are completely typical of fixed
languages of homomorphisms.


Definition 2.12 Any symbol k in A that occurs exactly once in a word w in 
A* is called a key of w. A word w for which there is at least one key symbol is 
called a key word. 


Note that for the word dad the symbol a is the unique key. For the word bcd 
each of the three symbols b, c, and d is a key. Consequently both dad and bcd 
are key words. Finally, the word abcbac is not a key word since it has no key. 


Definition 2.13 A set X of words is called a key code if each word in X is a 
key word and a key for each word in X can be chosen that does not occur in 
any other word in X. 


Note that the set of generators given in Example 2.5, namely X = 
{dad, bcd}, is a key code. The word dad allows only the unique symbol a 
to be chosen as its key. The word bcd allows each of b, c, and d to be chosen 
as a key. To confirm that X is a key code we cannot use d as a key for the word bcd,
since d also occurs in dad; but if either b or c is chosen as the key for bcd, then it
is confirmed that X is a key code. Each key code X is uniquely decipherable since,
given any string that is the concatenation of words chosen from X, simply noting the
key symbols that occur in the string provides the unique segmentation of the string
into code words.
The key codes constitute a very restricted subclass of the uniquely decipher- 
able codes. Such simple codes as {aa, bb, cc, dd} are uniquely decipherable, 
but contain no key word at all. 


Example 2.6 Let A = {a, b,c, d}. The following are key codes: {a, b, c, d}, 
{a, bcc, dcc}, {abcbc, bbd}, {ababcd}, and the empty set. Note the crucial fact 
that {a, b, c, d} is the only key code in A* that consists of exactly four words, 
since each four-word key code must use each of the four symbols in A as 
a key. The following are not key codes: {abba}, {abcd, c}, {abc, bcd, cda}, 
{a, b, c, dd}. Note that a subset of A* that contains five or more words cannot 
be a key code, since there are only four possible keys. 
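
Definition 2.13 translates directly into a small test. The following Python sketch (function names are illustrative, not taken from the text) treats a set of words as a key code when every word has at least one symbol that occurs exactly once in it and in no other word of the set.

```python
from collections import Counter


def keys_for(word, others):
    """Symbols that occur exactly once in word and in no word of others."""
    used_elsewhere = set("".join(others))
    return {s for s, n in Counter(word).items()
            if n == 1 and s not in used_elsewhere}


def is_key_code(words):
    """True if each word has a key that does not occur in any other word."""
    words = set(words)
    return all(keys_for(w, words - {w}) for w in words)


print(is_key_code({"dad", "bcd"}))        # the generators of Example 2.5
print(is_key_code({"aa", "bb", "cc", "dd"}))
```

A key chosen this way for one word cannot be the key of another, so no separate matching step is needed.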


The following technical result is the basis for the theorem that establishes 
the firm relationships that hold among the concepts: fixed language, retract, and 
key code. 

The following proposition was discovered by Tom Head [16]. 


Proposition 2.1 Let A be an alphabet and h : A* → A* be a homomorphism.
Let X = {a ∈ A : h(a) = uav where only mortal symbols occur in u and v}.
For each a in X, let N_a be the least nonnegative integer for which h^{N_a}(uv) = λ.
Let H = {h^{N_a}(a) : a ∈ X}. The fixed language L of h is the submonoid of A*
generated by H. The correspondence a ↔ h^{N_a}(a) is a one-to-one correspondence
between X and H.


Proof (1) H* ⊆ L: Since h is a homomorphism it is sufficient to verify that
each element h^{N_a}(a) of H is in L, which is confirmed by the calculation:


h(h^{N_a}(a)) = h^{N_a}(h(a))
              = h^{N_a}(uav)
              = h^{N_a}(u) h^{N_a}(a) h^{N_a}(v)
              = λ h^{N_a}(a) λ
              = h^{N_a}(a).


(2) L ⊆ H*: Let w be in L. Let a_1, a_2, a_3, ..., a_n be the subsequence consisting
of the occurrences of the vital symbols of w. We now use the principle: a
vital symbol can come only from a vital symbol and only a mortal symbol
can eventually be erased. Since h(w) = w, each a_i must occur exactly once
in h(a_i). It follows that, for each a_i, we must have h(a_i) = u_i a_i v_i where each
of u_i and v_i must consist entirely of mortal symbols. Thus for each i there is a least
nonnegative integer N(i) for which h^{N(i)}(u_i v_i) = λ. Let N be the largest of
the N(i). Note that, for each a_i, h^N(a_i) = h^{N(i)}(a_i) is in H. Then w = h(w) =
h^N(w) = h^N(a_1) h^N(a_2) h^N(a_3) ... h^N(a_n) is in H*.

From (1) and (2) we have L = H*. Note that H = {h^{N_a}(a) : a ∈ X} is a
key code since the set X is a set of keys for H.
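
The construction of H in Proposition 2.1 is effective, and a minimal Python sketch may help to fix the idea. The homomorphism `h` below is an assumption (chosen to agree with the values quoted for Example 2.5), and the helper names are hypothetical.

```python
def apply(h, word):
    """Image of word under the homomorphism h (a dict from symbols to words)."""
    return "".join(h[s] for s in word)


def mortal_symbols(alphabet, h):
    """Symbols a with h^n(a) empty for some positive n."""
    mortal = set()
    while True:
        nxt = {a for a in alphabet if all(s in mortal for s in h[a])}
        if nxt == mortal:
            return mortal
        mortal = nxt


def fixed_language_generators(alphabet, h):
    """Map each a in X to h^{N_a}(a), following Proposition 2.1."""
    mortal = mortal_symbols(alphabet, h)
    gens = {}
    for a in alphabet:
        image = h[a]
        if image.count(a) != 1:
            continue                      # h(a) is not of the form u a v
        uv = image.replace(a, "", 1)      # the word uv
        if any(s not in mortal for s in uv):
            continue                      # u or v contains a vital symbol
        word = a                          # h^0(a)
        while uv:                         # stop at the least N with h^N(uv) empty
            uv = apply(h, uv)
            word = apply(h, word)
        gens[a] = word
    return gens


h = {"a": "dad", "b": "bc", "c": "d", "d": ""}   # assumed example
H = fixed_language_generators("abcd", h)
print(H)
```

With this assumed h the result pairs a with dad and b with bcd, and each generator is fixed by h, in line with part (1) of the proof.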














Theorem 2.4 Let A be a finite alphabet and L be a language contained in 
A*. The following three conditions on L are equivalent: 


(1) L is the fixed language of a homomorphism of A* into A*; 
(2) L is a submonoid of A* that is generated by a key code; and 
(3) L is a retract of A*. 


Proof (1 ⇒ 2): Let h : A* → A* be a homomorphism with fixed language L.
Proposition 2.1 provides us with the key code H for which L = H*.

(2 ⇒ 3): Let L be a submonoid of A* that is generated by a key code X and
let K be a set of keys for X. For each k in K there are strings x_k and y_k for
which x_k k y_k is the key word in X having k as its key. Define a homomorphism
r : A* → A* by r(k) = x_k k y_k for each k in K and r(a) = λ for each a not in
K. Note that r is a retraction of A* having X* = L as its image, hence L is a
retract.

(3 ⇒ 1): Let L be a retract of A*. Then there is a retraction r : A* → A*
that has L as its image. Then L is the fixed language of the homomorphism r.
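
The retraction built in the step (2 ⇒ 3) can be checked mechanically on small cases. The sketch below uses the key code {dad, bcd} with the keys a and b; the function names are illustrative only.

```python
from itertools import product


def retraction_from_key_code(alphabet, keyed_words):
    """keyed_words maps each chosen key k to its key word x_k k y_k.

    Every symbol that is not a chosen key is sent to the empty word.
    """
    return {s: keyed_words.get(s, "") for s in alphabet}


def apply(r, word):
    return "".join(r[s] for s in word)


def is_idempotent_up_to(r, alphabet, max_len):
    """Check r(r(w)) == r(w) for every word w of length at most max_len."""
    return all(apply(r, apply(r, "".join(t))) == apply(r, "".join(t))
               for n in range(max_len + 1)
               for t in product(alphabet, repeat=n))


r = retraction_from_key_code("abcd", {"a": "dad", "b": "bcd"})
print(apply(r, "dad"), apply(r, "bcd"), apply(r, "abc"))
print(is_idempotent_up_to(r, "abcd", 5))
```

Because each key word contains its own key once and no other chosen key, r fixes every generator, which is why r composed with itself equals r.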

Theorem 2.4 has several valuable corollaries the proofs of which will be
relegated to the exercises.


Corollary 2.1 A retract of a free monoid is free. 


Note that this property of retracts does not hold for arbitrary submonoids of 
a free monoid since, for any nonempty alphabet A, the submonoid consisting 
of all words of length >2 is not free. 


Corollary 2.2 If A is an alphabet having exactly n symbols, then no inclusion
chain of distinct retracts of A* has more than n + 1 retracts even when the
retract {λ} is included.


Corollary 2.3 If X is a key code and x^n lies in X*, then so does x.


Corollary 2.4 If X is a key code and both uv and vu lie in X*, then so do u 
and v. 


Let A = {a_1, a_2, a_3, ..., a_n}. A simple example of a longest possible inclusion
chain of retracts in A* is


{a_1, a_2, a_3, ..., a_n}*, {a_2, a_3, ..., a_n}*, {a_3, ..., a_n}*, ..., {a_n}*, {λ}.


Each of these retracts, except the first, is maximal among the retracts contained 
in its predecessor. In each case the number of generators of the subretract is 
one less than the number of generators of its predecessor. However, maximal 
proper subretracts of a retract can have many fewer generators: 


Proposition 2.2 Let n be a positive integer and A = {a_1, a_2, a_3, ..., a_n} an
alphabet of n symbols. Let m be any positive integer less than n. Then A*
contains a maximal proper retract generated by exactly m words.


Proof The set of m words 


_ 272) 2 2 
K= fai, a2, a3, Mw As hp 4A (hy 1305 


2 
Gy —1 4n 4m An +1 4m 4-243 + + + an—1an} 


is a key code for which K* is a maximal proper retract of A*. 
The verification of the maximality is left as an exercise. 














The retracts of a free monoid and the partially ordered set they form
under set inclusion have been studied previously in [16], [10], and [9].


Exercises


(1) Which of the following sets are key codes?

(a) {a, ab, ac, d} 

(b) {ab, ac, ad, ae} 

(c) {aabaa, aacaa, ddeda, dadaf } 
(d) {abba, acca, adda, aeea}. 

(2) Define retraction maps of A* whose retracts are generated by the following
key codes, where A = {a, b, c, d, x, y, z}:

(a) {aabaa, acaax, daaxy} 
(b) {ax, bx, cx, dx} 
(c) {abcd}. 

(3) Prove that the restriction of a retraction f : X → X to the image of f is
the identity mapping of the image of f onto itself.

(4) Prove Corollary 2.1 that a retract of a free monoid is free. 

(5) Prove Corollary 2.2: if A is an alphabet having exactly n symbols, then no
inclusion chain of distinct retracts of A* has more than n + 1 retracts even
when the retract {λ} is included.

(6) Prove Corollary 2.3: if X is a key code and x^n lies in X*, then so does x.

(7) Prove Corollary 2.4: if X is a key code and both uv and vu lie in X*, then
so do u and v.


2.3 Semiretracts and lattices (Optional) 


The intersection of two retracts of the free monoid on a finite set A need not be 
a retract if A contains four or more symbols. Possibly the simplest example is 
the following one adapted from [7]: Let A = {a, b, c, d}. The sets {ab, ac, d} 
and {ba, c, da} are key codes and consequently the submonoids R and R' that
they generate are retracts of A*. However, their intersection


R ∩ R' = (d(ab)*ac)*


is not only not a retract; it is not even finitely generated. The set d(ab)*ac is a
uniquely decipherable code since simply noting the locations of the occurrences
of the symbol d in any string that is a concatenation of these words provides
the unique segmentation of the string into generators. Thus R ∩ R' is a free
submonoid, although not a retract of A*. In fact, the intersection of any family,
whether finite or infinite, of free submonoids of a free monoid is free [37]. 
Consequently the family of free submonoids of a free monoid is always a 
complete lattice. By broadening our attention slightly we obtain a similarly 
attractive stability result for what we call semiretracts of free monoids: 

Definition 2.14 By a semiretract of A*, we mean an intersection of a finite
number of retracts of A*. 


Each retract of A* is also a semiretract. The clearest example of a semiretract 
that is not a retract is the example given previously: R ∩ R' = (d(ab)*ac)*.
Some pairs of retracts have as their intersection a retract:


{abc, d}* ∩ {a, bcd}* = {abcd}*.
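
Both intersections can be checked by brute force on short words. The following Python sketch is only a finite sanity check, not a proof: it enumerates all words over {a, b, c, d} up to a chosen length, keeps those lying in both submonoids, and compares the result with the words matching the claimed regular expression. All function names are illustrative.

```python
import re
from itertools import product


def in_star(word, code):
    """True if word is a concatenation of words from code (dynamic programming)."""
    ok = [True] + [False] * len(word)
    for i in range(1, len(word) + 1):
        ok[i] = any(ok[i - len(x)] for x in code
                    if len(x) <= i and word[i - len(x):i] == x)
    return ok[-1]


def words_up_to(alphabet, max_len):
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def intersection_up_to(code1, code2, alphabet, max_len):
    return {w for w in words_up_to(alphabet, max_len)
            if in_star(w, code1) and in_star(w, code2)}


def matching_up_to(pattern, alphabet, max_len):
    regex = re.compile(pattern)
    return {w for w in words_up_to(alphabet, max_len) if regex.fullmatch(w)}


first = intersection_up_to({"ab", "ac", "d"}, {"ba", "c", "da"}, "abcd", 7)
second = intersection_up_to({"abc", "d"}, {"a", "bcd"}, "abcd", 8)
print(sorted(first, key=len))
print(sorted(second, key=len))
```

The first set contains generators of unbounded shape such as dac, dabac, and dababac, which illustrates why R ∩ R' is not finitely generated.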


As stated above, but not to be demonstrated here, if fewer than four alphabet
symbols appear in the key codes that generate a class of retracts, then the
intersection of this collection must also be a retract.

Since every retract of A* is a regular language, every semiretract is also 
a regular language. Thus this section continues to yield examples of regular 
languages. 

The definition of a semiretract provides one closure property, “the intersection
of a finite number of semiretracts is a semiretract.” A stronger result is
true, but not obvious, “the intersection of any finite or infinite collection of
semiretracts of a free monoid A* on a finite alphabet A is again a semiretract.”
This is an immediate consequence of a co-compactness property of the family 
of semiretracts of A* which is included in the appendices. Every collection of 
retracts of A* has a finite sub-collection that has the same intersection as the 
original collection. The intersection of the finite sub-collection is a semiretract 
of A* by the definition of semiretract. 

The elementary set theoretic union of two semiretracts need not be a semire- 
tract nor even a submonoid: a* and b* are retracts, but a* U b* is not a sub- 
monoid. However, given any collection C of semiretracts, whether finite or 
infinite, let M be the intersection of the class of all semiretracts of A*, each 
of which contains every semiretract in C. There is at least one semiretract that 
contains them all, namely A* itself. The resulting intersection M is a semiretract 
of A* as explained in the previous paragraph. For each such C we denote M by
∨C. We summarize the discussions of this section as:


Theorem 2.5 Let A be a finite alphabet. The set of all semiretracts of A* is a
complete lattice with binary operations ∩ and ∨ having A* as maximal element
and {λ} as minimal element.


The semiretracts of a free monoid and the lattice they form have been studied 
previously in [1], [2], and [3]. 

Exercises


(1) Find the code of the semiretract which is the intersection of retracts with 
key codes {ab, cb, cd} and {a, bc, d}. 

(2) Find the code of the semiretract which is the intersection of retracts with 
key codes {ab, st, sd, ef, eg} and {a, bs, ts, de, fe, g}. 

(3) Find the code of the semiretract which is the intersection of retracts with 
key codes {ba, st, sd, ef, eg} and {as, bs, ts, de, fe, g}. 

(4) Find the key codes of two retracts whose intersection has the basis 
ab(def)*dgh. 

(5) Find the key codes of two retracts whose intersection has the basis 
ab(de)*dfg(hk)*hm.

(6) Prove that a key code of a retract is a prefix code. 

(7) Prove that a key code of a retract is an infix code. 

(8) Find a semiretract that is the intersection of three retracts, but not two 
retracts. 


3 


Automata 


3.1 Deterministic and nondeterministic automata 


An automaton is a device which recognizes or accepts certain elements of Σ*,
where Σ is a finite alphabet. Since the elements accepted by the automaton are
a subset of Σ*, they form a language. Therefore each automaton will recognize
or accept a language contained in Σ*. The language consisting of the
words of Σ* accepted by an automaton M is the language over Σ accepted by M
and denoted M(L). We will be interested in the types of language an automaton
accepts.


Definition 3.1 A deterministic automaton, denoted by (Σ, Q, s0, Y, F), consists
of a finite alphabet Σ, a finite set Q of states, and a function Y : Q × Σ → Q,
called the transition function. The set Q contains an element s0, the initial
state, and a subset F, the set of acceptance states.


The input of Y is a letter of Σ and a state belonging to Q. The output is a
state of Q (possibly the same one). If the automaton is in state s and “reads”
the letter a, then (s, a) is the input for Y and Y(s, a) is the next state. Given a
string in Σ* the automaton “reads” the string or word as follows. Beginning at
the initial state s0, and beginning with the first letter in the string (if the string is
nonempty), it reads the first letter of the string. If the first letter is the letter a of
Σ, then it “moves” to state s = Y(s0, a). The automaton next “reads” the second
letter of the string, say b, and then moves to state s' = Y(s, b). Therefore, as the
automaton continues to “read” a string of letters from the alphabet it “moves”
from one state to another. Eventually the automaton “reads” every letter in the
string and then stops. If the state the automaton is in after reading the last letter
belongs to the set of acceptance states, then the automaton accepts the string.
Let M be the automaton with alphabet Σ = {a, b}, set of states Q = {s0, s1, s2},
and Y defined by the table


        s0    s1    s2

  a     s1    s2    s2
  b     s0    s0    s1


Suppose M “reads” the string aba. Since the automaton begins in state s0, and
the letter read is a, and Y(s0, a) = s1, the automaton is now in state s1. The
next letter read is b and Y(s1, b) = s0. Finally the last letter a is read and, since
Y(s0, a) = s1, the automaton finishes in state s1. We may also state Y as a set
of rules as follows:


If in state s0 and a is read go to state s1.
If in state s1 and a is read go to state s2.
If in state s2 and a is read go to state s2.
If in state s0 and b is read go to state s0.
If in state s1 and b is read go to state s0.
If in state s2 and b is read go to state s1.


Let s0 and s2 be the acceptance states.
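
The six rules and the choice of acceptance states above determine M completely, so its behaviour can be simulated directly. The following Python sketch is one possible encoding; the names `delta`, `run`, and `accepts` are illustrative.

```python
# Transition function Y of M, one entry per rule.
delta = {
    ("s0", "a"): "s1", ("s1", "a"): "s2", ("s2", "a"): "s2",
    ("s0", "b"): "s0", ("s1", "b"): "s0", ("s2", "b"): "s1",
}
accepting = {"s0", "s2"}


def run(word, start="s0"):
    """State reached after reading word letter by letter from start."""
    state = start
    for letter in word:
        state = delta[(state, letter)]
    return state


def accepts(word):
    return run(word) in accepting


for w in ["aba", "bbaaa", "bab"]:
    print(w, run(w), accepts(w))
```

Reading aba ends in s1, which is not an acceptance state, while bbaaa and bab end in s2 and s0 respectively.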

This deterministic automaton is best shown pictorially by a state diagram
which is a directed graph where the states are represented by the vertices and
each edge from s to s' is labeled with a letter, say a, of the alphabet Σ if
Y(s, a) = s'. A directed arrow from s to s' labeled with the letter a will be
called an a-arrow from s to s'. If s is a starting state, then its vertex is denoted
by the diagram


[Diagram not reproduced.]


If s is an acceptance state, its vertex is denoted by the diagram


[Diagram not reproduced.]


Therefore the deterministic automaton above may be represented pictorially
as seen in Fig. 3.1.


[Figure 3.1: state diagram of the automaton M, not reproduced.]


More specifically, an automaton “reads” a word or string
a_0 a_1 ... a_n of Σ* by first reading a_0, then reading a_1, and continuing until it
has read a_n. If an automaton is in state s1 and reads the word w and is then
in s2, then w is a path from s1 to s2. A deterministic automaton accepts or
recognizes a_0 a_1 a_2 ... a_n if after beginning with a_0 in state s0 and continuing
until reading a_n the automaton stops in an acceptance state. Thus the automaton
above would not accept aba since s1 is not an acceptance state. It would however
accept bbaaa and bab, since s0 and s2 are acceptance states.
The automaton with the state diagram 


[State diagram not reproduced.]


has initial state s0 and acceptance state s3. It accepts the word aba since after
reading a, it is in state s1. After reading b, it is still in state s1. After reading
the second a, it is in state s3, which is an acceptance state. One can see that 
it also accepts abbba and bb, so they are in the language accepted by the 
given automaton. However bbb, abab, and abb are not. Notice that any string 
beginning with two as or two bs is accepted only if the string is not extended. 
Also, if three as occur in the string, the string is not accepted. The state s4 is an 
example of a sink state. Once the automaton is in the sink state, it can never 
leave this state again, regardless of the letter read. 

Since Y is a function, a deterministic automaton can always read the entire 
string. We shall later define a nondeterministic automaton which may not always 
be able to read the entire string. In such a case the word cannot be accepted. 


Example 3.1 Consider the automaton with state diagram 


[State diagram not reproduced.]


having Σ = {a, b}, starting state s0, and acceptance states s0, s1, and s2. It
obviously accepts the word bb. In each state, there is a loop for a so that
if a is read then the state does not change. This enables us to read as many
a's as desired without changing states, before reading another b. Thus the
automaton accepts aababaaa, baabab, baaab, babaaa, aabaabaa, and in fact
it accepts any word in the language described by the regular expression
(a*ba*ba*) ∨ (a*ba*) ∨ a*. This language can also be described as the set of all
words containing at most two bs. Notice that s3 is a sink state.


Example 3.2 Consider the automaton with state diagram 


[State diagram not reproduced.]


which we simplify as


[State diagram not reproduced.]


to decrease the number of arrows. This automaton obviously accepts only
the words ab and ac. This language may be described by the regular expression
a(b ∨ c). Notice that the sink state s2 eliminates all other words from the
language.


Example 3.3 Consider the automaton with state diagram 


[State diagram not reproduced.]


The only words accepted are b and abc. Therefore the expression for the language
accepted is b ∨ abc.


Example 3.4 Consider the automaton with state diagram


[State diagram not reproduced.]


In this automaton, if three consecutive bs are read, then the automaton is in state
s3, which is a sink state and is not an acceptance state. This is the only way to
get to s3 and every other state is an acceptance state. Thus the language accepted
by this automaton consists of all words which do not have three consecutive bs.
An expression for this language is


(a ∨ (ba) ∨ (bba))*(λ ∨ b ∨ (bb)).
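
As a sanity check, the expression can be compared with the condition "no three consecutive bs" on all short words. The sketch below writes the expression in Python's `re` syntax, where `|` stands for ∨ and an optional group stands for the alternative λ; the function name is illustrative.

```python
import re
from itertools import product

# (a ∨ ba ∨ bba)* followed by λ, b, or bb
EXPRESSION = re.compile(r"(a|ba|bba)*(b|bb)?")


def agrees_up_to(max_len):
    """Compare the expression with 'bbb does not occur' on all short words."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            matched = EXPRESSION.fullmatch(word) is not None
            if matched != ("bbb" not in word):
                return False
    return True


print(agrees_up_to(10))
```

The check succeeds because any word without bbb splits uniquely into blocks a, ba, bba, followed by at most two trailing bs.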


As previously mentioned, the automata that we have been discussing are
called deterministic automata since in every state and for every letter of the
alphabet that is read, there is one and only one state to which the automaton can
move. In other words, Y : Q × Σ → Q is a function. It is often convenient to relax
the rules so that Y is no longer a function, but a relation. If we again consider
Y as a set of rules, given a ∈ Σ and s ∈ Q, the rules may allow advancement
to each of several states, or there may be no rule allowing the automaton to
go to any state after reading a in state s. In the latter case, the automaton is
“hung up” and can proceed no further. This cannot occur with a deterministic
automaton.

Although the definition of a nondeterministic automaton varies, we shall use 
the following definition: 


Definition 3.2 A nondeterministic automaton, denoted by
(Σ, Q, s0, Y, F)
consists of a finite alphabet Σ, a finite set Q of states, and a function
Y : Q × Σ → P(Q)


called the transition function. The set Q contains an element s0 and a subset
F containing one or more acceptance states. (Note that P(Q) is the power set
of Q.)


Thus given a ∈ Σ and s ∈ Q, there may be a-arrows from s to several different
states or to no state at all. By definition, a deterministic automaton is also
considered to be a nondeterministic automaton. A nondeterministic automaton
often simplifies the state diagram and eliminates the need for a sink state. In
Example 3.2, the state diagram can be simplified to


[State diagram not reproduced.]


Note that in reading aa, after reading the first a, the automaton is in state s3, 
and when the second a is read the automaton “hangs up”, since there is no a 
arrow out of state s3. 


Example 3.5 The deterministic automaton represented by 


[State diagram not reproduced.]


can be simplified using a nondeterministic automaton by simply eliminating 
state s4 and all arrows into or out of this state. 


Example 3.6 It is easily seen that the automaton with state diagram


[State diagram not reproduced.]


accepts the language with regular expression ab*c.


Example 3.7 The automaton with state diagram


[State diagram not reproduced.]


accepts the language with regular expression a ∨ b.


Example 3.8 The automaton with state diagram


[State diagram not reproduced.]


accepts the language with regular expression aa*bb*.


Example 3.9 The automaton with state diagram


[State diagram not reproduced.]


accepts the language consisting of strings with at least two as and so may be
written as (a ∨ b)*a(a ∨ b)*a(a ∨ b)*.


Obviously any language accepted by a deterministic automaton is accepted 
by a nondeterministic automaton since the set of deterministic automata is a 
subset of the set of nondeterministic automata. In the following theorem, how- 
ever, we shall see that any language accepted by a nondeterministic automaton 
is also accepted by a deterministic automaton. 


Theorem 3.1 For each nondeterministic automaton, there is an equivalent 
deterministic automaton that accepts the same language. 


We demonstrate how to construct a deterministic automaton which accepts 
the language accepted by a nondeterministic automaton. We shall later give a 
formal proof that a language is accepted by a deterministic automaton if and 
only if it is accepted by a nondeterministic automaton. If Q is the set of states 
for the nondeterministic automaton, we shall use elements of P(Q), i.e. the 
set of subsets of Q, as states for the deterministic automaton which we are 
constructing. Some of these states may not be used since they do not occur
on any path which leads to an acceptance state. Hence they could be removed,
greatly simplifying the deterministic automaton created. However, for our
purpose, we are only interested in showing that a deterministic automaton can be
created.

In general we have the following procedure for constructing a deterministic
automaton


M = (Σ, Q', {s0}, Y', F')


from a nondeterministic automaton


N = (Σ, Q, s0, Y, F).


(1) Begin with the state {s0} where s0 is the start state of the nondeterministic
automaton.

(2) For each a_i ∈ Σ, construct an a_i-arrow from {s0} to the set consisting of all
states such that there is an a_i-arrow from s0 to that state.

(3) For each newly constructed set of states s_j and for each a_i ∈ Σ construct
an a_i-arrow from s_j to the set consisting of all states such that there is an
a_i-arrow from an element of s_j to that state.

(4) Continue this process until no new states are created.

(5) Make each set of states s_j that contains an element of the acceptance set
of the nondeterministic automaton into an acceptance state.


Example 3.10 Consider the nondeterministic automaton N 


[State diagram not reproduced.]


Construct an a-arrow from {s0} to the set of all states so that there is an a-arrow
from s0 to that state. Since there is an a-arrow from s0 to s0 and an a-arrow from
s0 to s1, we construct an a-arrow from {s0} to {s0, s1}. There is no b-arrow from
s0 to any state. Hence the set of all states such that there is a b-arrow from s0 to
one of these states is empty and we construct a b-arrow from {s0} to the empty set Ø.
We now consider the state {s0, s1}. We construct an a-arrow from {s0, s1} to the
set of all states such that there is an a-arrow from either s0 or s1 to that state.
Thus we construct an a-arrow from {s0, s1} to itself. We construct a b-arrow
from {s0, s1} to the set of all states such that there is a b-arrow from either s0
or s1 to that state. Thus we construct a b-arrow from {s0, s1} to {s2}. Since there
are no a-arrows or b-arrows from any state in the empty set to any other state,
we construct an a-arrow and a b-arrow from the empty set to itself. Consider
{s2}. Since there is no a-arrow from s2 to any other state, we construct an
a-arrow from {s2} to the empty set. Since the only b-arrow from s2 is to itself, we
construct a b-arrow from {s2} to itself. The acceptance states consist of all sets
which contain an element of the terminal set of N. In this case {s2} is the only
acceptance state. We have now completed the state diagram


[State diagram not reproduced.]


which is easily seen to be the state diagram of a deterministic automaton. This
automaton also accepts the same language as N, namely the language described
by the expression aa*bb*.
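
The five-step procedure can be sketched as a short breadth-first search over sets of states. In the Python below, the arrows of N are an assumed encoding of Example 3.10 inferred from its description (in particular, s1 is assumed to have no a-arrow); the function names are illustrative.

```python
from collections import deque


def subset_construction(alphabet, delta, start, accepting):
    """Build a deterministic automaton from a nondeterministic one.

    delta maps (state, letter) to a set of states; a missing pair means
    that there is no arrow with that label out of the state.
    """
    start_set = frozenset({start})                      # step (1)
    dfa_delta, seen, queue = {}, {start_set}, deque([start_set])
    while queue:                                        # steps (2) to (4)
        current = queue.popleft()
        for letter in alphabet:
            target = frozenset(t for s in current
                               for t in delta.get((s, letter), ()))
            dfa_delta[(current, letter)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_accepting = {S for S in seen if S & accepting}  # step (5)
    return seen, dfa_delta, start_set, dfa_accepting


def dfa_accepts(word, dfa_delta, start_set, dfa_accepting):
    state = start_set
    for letter in word:
        state = dfa_delta[(state, letter)]
    return state in dfa_accepting


# Assumed arrows of the automaton N of Example 3.10.
nfa_delta = {("s0", "a"): {"s0", "s1"}, ("s1", "b"): {"s2"}, ("s2", "b"): {"s2"}}
states, d, q0, final = subset_construction("ab", nfa_delta, "s0", {"s2"})
print(len(states), dfa_accepts("aabb", d, q0, final), dfa_accepts("aba", d, q0, final))
```

Only the sets reachable from {s0} are generated, so the result has the four states {s0}, {s0, s1}, {s2}, and Ø rather than all eight subsets of Q.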


Example 3.11 Given the nondeterministic automaton 





At this point we introduce a new notation. The ordered pair (s_i, w) indicates
that the automaton is in state s_i and still has input w left to read. For example,
(s2, abbb) indicates that the automaton is in state s2 and must still read abbb.
Assume that we have (s_i, aw), where w ∈ Σ*. Thus the automaton is in state s_i and
must still read a followed by w. The notation (s_i, aw) ⊢ (s_j, w) means that the
automaton has read a and moved from state s_i to state s_j. Therefore Y(s_i, a) =
s_j. In the automaton


[State diagram not reproduced.]


we have (s2, bab) ⊢ (s3, ab). We also have

(s0, babba) ⊢ (s1, abba) ⊢ (s0, bba) ⊢ (s1, ba) ⊢ (s2, a) ⊢ (s0, λ).


If we have (s_i, w_i) ⊢ (s_j, w_j) ⊢ ··· ⊢ (s_m, w_m), we denote this by (s_i, w_i) ⊢*
(s_m, w_m). We also let (s, w) ⊢* (s, w). Thus a word w is accepted by an automaton
if and only if (s0, w) ⊢* (s, λ) where s is an acceptance state. In our example
(s0, bababb) ⊢* (s2, λ), so bababb is accepted by the automaton.
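
A chain of configurations can be generated mechanically. The sketch below assumes that the automaton in question is the one of Example 3.4: reading a returns to s0 from s0, s1, and s2, reading b advances s0 to s1 to s2 to s3, and s3 is a sink. Only some of these arrows are confirmed by the derivation displayed above; the rest are assumptions, as is the function name `trace`.

```python
# Assumed transition function (see the lead-in for which arrows are inferred).
delta = {
    ("s0", "a"): "s0", ("s0", "b"): "s1",
    ("s1", "a"): "s0", ("s1", "b"): "s2",
    ("s2", "a"): "s0", ("s2", "b"): "s3",
    ("s3", "a"): "s3", ("s3", "b"): "s3",
}


def trace(state, word):
    """List the configurations (state, remaining input) visited while reading word."""
    configs = [(state, word)]
    while word:
        state = delta[(state, word[0])]
        word = word[1:]
        configs.append((state, word))
    return configs


for config in trace("s0", "babba"):
    print(config)
```

Each consecutive pair in the returned list is one step of the relation ⊢, and the first and last entries are related by ⊢*; the empty string stands for λ.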

We shall now prove that a language is accepted by a deterministic automa- 
ton if and only if it is accepted by a nondeterministic automaton. We begin 
with two lemmas. The first is obvious since every deterministic automaton is a 
nondeterministic automaton. 


Lemma 3.1 Every language accepted by a deterministic automaton is 
accepted by a nondeterministic automaton. 


Lemma 3.2 Let N = (Σ, Q, s0, Y, F) be a nondeterministic automaton and
M = (Σ, Q', {s0}, Y', F') be the deterministic automaton derived from N using
the above process. Then (s0, w) ⊢* (s, λ) in N if and only if there exists X such
that ({s0}, w) ⊢* (X, λ) in M where s ∈ X.


Proof We first show that if (s0, w) ⊢* (s, λ) in N, then ({s0}, w) ⊢* (X, λ)
where s ∈ X. The proof uses induction on the length n of w. If n = 0, we
have (s0, λ) ⊢* (s0, λ) in N, ({s0}, λ) ⊢* ({s0}, λ) in M, and s0 ∈ {s0}, so the
statement is true if n = 0. Assume the statement is true for words of length k and
that w = va ∈ Σ+ has length k + 1, so v has length k. Since (s0, va) ⊢* (s, λ),
then (s0, va) ⊢* (t, a) ⊢ (s, λ) for some t ∈ Q
and (s0, v) ⊢* (t, λ). Therefore by induction, there exists Y so that t ∈ Y and
({s0}, v) ⊢* (Y, λ). Since t ∈ Y and (t, a) ⊢ (s, λ) in N, (Y, a) ⊢ (X, λ) for some
X where s ∈ X. Therefore ({s0}, va) ⊢* (Y, a) ⊢ (X, λ) or ({s0}, va) ⊢* (X, λ)
where s ∈ X.

Conversely, we show that if ({s0}, w) ⊢* (X, λ) in M, then (s0, w) ⊢* (s, λ)
in N where s ∈ X. We again use induction on n, the length of the word w.
Assume there exists X such that ({s0}, w) ⊢* (X, λ) in M where s ∈ X. If
n = 0, we have ({s0}, λ) ⊢* ({s0}, λ) in M, (s0, λ) ⊢* (s0, λ) in N, and s0 ∈ {s0},
so the statement is true if n = 0. Given ({s0}, va) where va has length k + 1, so v has
length k, assume ({s0}, va) ⊢* (Y, a) ⊢ (X, λ). Therefore ({s0}, v) ⊢* (Y, λ).
By induction, (s0, v) ⊢* (t, λ) for all t in Y and hence (s0, va) ⊢* (t, a) for all t in
Y. By definition, since (Y, a) ⊢ (X, λ), each s ∈ X satisfies (t, a) ⊢ (s, λ) for some
t ∈ Y, and then (s0, w) ⊢* (s, λ) in N.














We are now able to prove the desired Theorem 3.1. 


Theorem 3.2 A language is accepted by a deterministic automaton if and 
only if it is accepted by a nondeterministic automaton. 


Proof To show this we need only show that a word is accepted by a nondeterministic
automaton if and only if it is accepted by the corresponding
deterministic automaton. If (s0, w) ⊢* (s, λ) where s is an acceptance state
in the nondeterministic automaton, then by the previous lemma ({s0}, w) ⊢* (X, λ)
where X contains the acceptance state s. Hence X is an acceptance state.
Conversely, assume ({s0}, w) ⊢* (X, λ) where X is an acceptance state. Then X
contains an acceptance state r from the nondeterministic automaton. By the
previous lemma, since ({s0}, w) ⊢* (X, λ) and r ∈ X, we have
(s0, w) ⊢* (r, λ). Since r is an acceptance state, w is accepted by the
nondeterministic automaton.














At this point we shall define an extended nondeterministic automaton and 
prove that a language is accepted by an extended nondeterministic automaton 
if and only if it is accepted by a nondeterministic automaton (and hence a 
deterministic automaton). 

Using a nondeterministic automaton, we can extend the automaton so that
(Σ+, Q, s0, Y, F) consists of Σ+, a finite set Q of states, and a function
Y : Q × Σ+ → P(Q), called the transition function. Thus Y reads words
instead of letters. This can be changed back to reading letters by adding new
nonterminal states. If Y reads the word w = a_1 a_2 ··· a_k and moves from state
s to state s', add states σ_2, σ_3, ..., σ_k and let Y(s, a_1) = σ_2, Y(σ_2, a_2) = σ_3,
Y(σ_3, a_3) = σ_4, ..., Y(σ_{k-1}, a_{k-1}) = σ_k, and Y(σ_k, a_k) = s'. This forms a
nondeterministic automaton, but we can form a deterministic automaton with
the same language as shown above.

If we allow the automaton to pass from one state s_i to another state s_j without
reading a letter of the alphabet, this may be shown as the automaton having an
edge from s_i to s_j with label λ. Thus paths may contain one or more λ's. Such
an automaton is said to have λ-moves. We can then have an automaton with the
form (Σ*, Q, s0, Y, F).

Formally a finite automaton M = (Σ, Q, s0, Y, F) with λ-moves has the
property that Y is defined on Q × (Σ ∪ {λ}). We wish to create a deterministic automaton
M' = (Σ, Q', s0', Y', F') containing no λ-moves with the same language. Thus
M(L) = M'(L). Given a state q in Q, define E(q) to be the set of all states that
are reachable from q without reading a letter in the alphabet. Thus E(q) =
{p : (q, w) ⊢* (p, w)}. In our construction, the set of states of M' is a subset of
P(Q). The state s0' = E(s0), and F' consists of the sets in Q' containing an element of F. For
each element a of Σ, define Y' by Y'(P, a) = ∪_{p ∈ P} E(Y(p, a)).
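
The sets E(q) and the function Y' can be computed with a simple graph search. The Python sketch below is an illustration under stated assumptions: the automaton is a hypothetical one with λ-moves s0 → s1 → s2 and loops labeled a, b, c on s0, s1, s2, chosen so that it accepts a*b*c*; it is not claimed to be the automaton of any example in the text, and all names are illustrative.

```python
import re
from itertools import product


def closure(q, lam):
    """E(q): states reachable from q by λ-moves alone (q itself included)."""
    seen, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for t in lam.get(p, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)


def step(P, a, delta, lam):
    """Y'(P, a): union of E(t) over all a-arrows from a state p in P to t."""
    return frozenset(x for p in P
                     for t in delta.get((p, a), ())
                     for x in closure(t, lam))


def accepts(word, start, accepting, delta, lam):
    P = closure(start, lam)              # the start state E(s0) of M'
    for a in word:
        P = step(P, a, delta, lam)
    return bool(P & accepting)           # P is accepting if it meets F


# Hypothetical automaton with λ-moves accepting a*b*c*.
lam = {"s0": {"s1"}, "s1": {"s2"}}
delta = {("s0", "a"): {"s0"}, ("s1", "b"): {"s1"}, ("s2", "c"): {"s2"}}
accepting = {"s2"}


def agrees_with(pattern, alphabet, max_len):
    """Compare the λ-free simulation with a regular expression on short words."""
    regex = re.compile(pattern)
    return all(accepts("".join(t), "s0", accepting, delta, lam)
               == (regex.fullmatch("".join(t)) is not None)
               for n in range(max_len + 1)
               for t in product(alphabet, repeat=n))


print(sorted(closure("s0", lam)), agrees_with(r"a*b*c*", "abc", 6))
```

Since `step` always returns a single set, possibly empty, the simulated automaton is deterministic and never hangs up.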

We first show that M’ is deterministic. It is certainly single valued. Further 
Y'(P, a) will always have a value even if it is the empty set. 

We must now show that M(L) = M'(L). To do this we show that for any
states p and q in Q, and any word w in Σ*


(p, w) ⊢* (q, λ) in M if and only if (E(p), w) ⊢* (P, λ) in M'


for some P containing q. From this it will follow that


(s0, w) ⊢* (f, λ) in M if and only if (E(s0), w) ⊢* (P, λ) in M'


for some P containing f, where f ∈ F.

We prove this using induction on the length of w. If |w| = 0, then w = λ,
and it must be shown that


(p, λ) ⊢* (q, λ) in M if and only if (E(p), λ) ⊢* (P, λ) in M'


for some P containing q. Now (p, λ) ⊢* (q, λ) if and only if q ∈ E(p);
but since M' is deterministic and no letter is read, P = E(p), so q ∈ P
if and only if q ∈ E(p). Therefore the statement is true if |w| = 0.
Assume the statement is true for all strings having nonnegative length k. We
now have to prove the statement is true for any string w with length k + 1.

⇒: Assume w = va for some letter a and word v, and (p, w) ⊢* (q, λ), so that


(p, va) ⊢* (q1, a) ⊢ (q2, λ) ⊢* (q, λ)


where at the end, possibly no letters of the alphabet are read. Since (p, va) ⊢*
(q1, a) then (p, v) ⊢* (q1, λ) and, by induction, (E(p), v) ⊢* (R, λ) for some R
containing q1. But since (q1, a) ⊢ (q2, λ), by construction, E(q2) ⊆ Y'(R, a),
and since (q2, λ) ⊢* (q, λ), q ∈ E(q2) by definition of E, and hence q ∈
Y'(R, a). Therefore (R, a) ⊢ (P, λ) for some P containing q by definition
of Y' and (E(p), va) ⊢* (R, a) ⊢ (P, λ) for some P containing q.

⇐: In M', assume (E(p), va) ⊢* (R, a) ⊢ (P, λ) where q ∈ P and Y'(R, a) =
P. By definition Y'(R, a) = ∪_{r ∈ R} E(Y(r, a)). There exists some state r ∈ R
such that Y(r, a) = s and q ∈ E(s). Therefore (s, λ) ⊢* (q, λ) by definition
of E(s). By the induction hypothesis (p, v) ⊢* (r, λ). Therefore (p, va) ⊢*
(r, a) ⊢ (s, λ) ⊢* (q, λ).


Example 3.12 Given the automaton M = (Σ, Q, s₀, Υ, F)

[Figure: state diagram of M, with arrows labeled a, b, c, and λ]

which has λ-moves, we construct M′ = (Σ, Q′, s₀′, Υ′, F′) containing no
λ-moves: E(s₀) = {s₀, s₁, s₂}, E(s₁) = {s₁, s₂}, E(s₂) = {s₂}, and E(s₃) =
{s₀, s₁, s₂, s₃}. Denote these sets by s₀′, s₁′, s₂′, and s₃′ respectively. Then Υ′
is given by the following table

         a      b      c
s₀′     s₀′    s₁′    s₂′
s₁′     ∅      s₁′    s₂′
s₂′     ∅      ∅      s₂′
s₃′     s₃′    s₁′    s₂′

giving the λ-free automaton

[Figure: state diagram of the λ-free automaton M′]

Both automata accept the language a*b*c*.




Exercises 


(1) Which of the following words are accepted by the automaton?

[Figure: state diagram for Exercise 1]

(a) abba.
(b) aabbb.
(c) babab.
(d) aaabbb.
(e) bbaab.
(2) Which of the following words are accepted by the automaton?

[Figure: state diagram for Exercise 2]

(a) aaabb.
(b) abbbabbb.
(c) bababa.
(d) aaabab.
(e) bbbabab.
(3) Write an expression for the language accepted by the automaton

[Figure: state diagram for Exercise 3]

(4) Write an expression for the language accepted by the automaton

[Figure: state diagram for Exercise 4]

(5) Write an expression for the language accepted by the automaton

[Figure: state diagram for Exercise 5]

(6) Write an expression for the language accepted by the automaton

[Figure: state diagram for Exercise 6]

(7) Find a deterministic automaton which accepts the language expressed by
aa*bb*cc*.
(8) Find a deterministic automaton which accepts the language expressed by
(a*ba*ba*b)*.
(9) Find a deterministic automaton which accepts the language expressed by
(a*(ba)*bb*a)*.
(10) Find a deterministic automaton which accepts the language expressed by
(a*b) ∨ (b*a)*.
(11) Find a nondeterministic automaton which accepts the language expressed
by aa*bb*cc*.
(12) Find a nondeterministic automaton which accepts the language expressed
by (a*b) ∨ (c*b) ∨ (ac)*.
(13) Find a nondeterministic automaton which accepts the language expressed
by (a ∨ b)*(aa ∨ bb)(a ∨ b)*.
(14) Find a nondeterministic automaton which accepts the language expressed
by ((aa*b) ∨ bb*a)ac*.
(15) Find a deterministic automaton which accepts the same language as the
nondeterministic automaton

[Figure: state diagram for Exercise 15]

(16) Find a deterministic automaton which accepts the same language as the
nondeterministic automaton

[Figure: state diagram for Exercise 16]

(17) Find a deterministic automaton which accepts the same language as the
nondeterministic automaton

[Figure: state diagram for Exercise 17]

(18) Find a deterministic automaton which accepts the same language as the
nondeterministic automaton

[Figure: state diagram for Exercise 18]

3.2 Kleene’s Theorem 
In this section we prove Kleene’s Theorem, which may be stated as follows:


Theorem 3.3 A language is regular if and only if it is accepted by an
automaton.


We begin by showing that the rules defining a regular language can be
duplicated by an automaton. First it is shown that there are automata which
accept subsets of the finite set Σ. Then it is shown that if there are automata
that accept languages L₁ and L₂, we can construct automata that accept L₁L₂,
L₁*, and L₁ ∪ L₂. Thus we show that for every regular language over a finite
set Σ, there is a nondeterministic automaton that accepts that language.
The set {λ} is accepted by the automaton with state diagram

[Figure: state diagram of an automaton accepting {λ}]

It may be preferable to create a state s₁, which is not an acceptance state, and
have all arrows from both s₀ and s₁ go to s₁.

We next show that for every finite subset of Σ, there is an automaton that reads
that subset. If the automaton has no acceptance state, then the language accepted
is the empty set. The elements of the set {a₁, a₂, a₃, ..., aₙ} are accepted by
the automaton with state diagram

[Figure: state diagram of an automaton accepting {a₁, a₂, a₃, ..., aₙ}]

In particular if a ∈ Σ, it is accepted by the automaton with state diagram

[Figure: state diagram of an automaton accepting {a}]
We next show that if regular languages L₁ and L₂ are both
accepted by automata, then their concatenation L₁L₂ is also accepted by an
automaton. Assume L₁ is accepted by the automaton M₁ = (Σ, Q, s₀, Υ, F),
and L₂ is accepted by the automaton M₂ = (Σ, Q′, s₀′, Υ′, F′). We shall
consider both automata to be deterministic without loss of generality. We now
define a new automaton M = (Σ, Q″, s₀″, Υ″, F″) which is essentially the first
automaton followed by the second automaton. Put simply, place the state
diagram for M₂ after the state diagram for M₁. If, for a ∈ Σ, there is an a-arrow
from any state s in the state diagram for M₁ to an acceptance state in the state
diagram for M₁, then change the acceptance state into a nonacceptance state
and also place an a-arrow from s to the starting state in the state diagram for
M₂. This is the state diagram for M. Thus the set of states Q″ = Q ∪ Q′, so that
Q″ consists of all the states in M₁ and M₂. We shall assume that M₁ and M₂
have no common states. If they do, we can always relabel them. Since we want
to begin in M₁, we let s₀ be the starting state of M so that s₀″ = s₀. Since we
want to finish in M₂, we let the set of acceptance states be F′ so that F″ = F′.
We define the rules for Υ″ as follows.

If the rule

If in state sᵢ and a is read, go to state sⱼ

is in Υ and sⱼ is not an acceptance state then include this rule in Υ″. If sⱼ is an
acceptance state then include this rule in Υ″ and also include the rule

If in state sᵢ and a is read, go to state s₀′.

Hence there is the option of going to the state sⱼ or skipping over to s₀′ in the
second automaton. Again recall that sⱼ now ceases to be an acceptance state.

If the rule is in Υ′ then it is included in Υ″. As a result, if the automaton M
has read a word in L₁, it may then skip over and read a word in L₂. As a special
case, consider the possibility of λ being a word in L₁. Include the rule

If state s₀ is an acceptance state, go to state s₀′.

When these rules are followed, a word in L₁L₂ ends up in an acceptance state of
M₂ and hence of M, so that it is accepted by M. Therefore every string consisting
of a word in L₁ followed by a word in L₂ is accepted by M, and L₁L₂ is
accepted by M.
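As an illustration of these rules, here is a minimal Python sketch. The representation is an assumption of the sketch, not the book's notation: an automaton is a triple (rules, start, acceptance states) with a dictionary from (state, letter) to state, missing entries mean the automaton hangs, and the special rule for λ in L₁ is modeled by allowing two start states instead of a λ-move. The two sample automata are written here for the expressions (ab)*c and ab*c*.

```python
def concatenate(m1, m2):
    """Combine deterministic automata with disjoint state names, M1 first, then M2."""
    rules1, start1, finals1 = m1
    rules2, start2, finals2 = m2
    rules = {}
    for (s, a), t in rules1.items():
        rules.setdefault((s, a), set()).add(t)
        if t in finals1:
            # Arrow into an acceptance state of M1: also allow skipping to M2's start.
            rules[(s, a)].add(start2)
    for (s, a), t in rules2.items():
        rules.setdefault((s, a), set()).add(t)
    starts = {start1}
    if start1 in finals1:
        # Lambda is a word of L1, so reading may begin directly in M2.
        starts.add(start2)
    # Only the acceptance states of M2 remain acceptance states.
    return rules, starts, set(finals2)


def nfa_accepts(nfa, word):
    """Follow every available arrow at once; accept if some path ends in acceptance."""
    rules, starts, finals = nfa
    current = set(starts)
    for a in word:
        current = {t for s in current for t in rules.get((s, a), ())}
    return bool(current & finals)


# Hypothetical automata for (ab)*c and for ab*c*.
m1 = ({("p0", "a"): "p1", ("p1", "b"): "p0", ("p0", "c"): "p2"}, "p0", {"p2"})
m2 = ({("q0", "a"): "q1", ("q1", "b"): "q1", ("q1", "c"): "q2", ("q2", "c"): "q2"},
      "q0", {"q1", "q2"})
m = concatenate(m1, m2)
print(nfa_accepts(m, "abcabbc"), nfa_accepts(m, "c"))
```

The result is nondeterministic, since a state of M₁ may now have two arrows with the same label, which is why the acceptance test tracks a set of current states.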


Example 3.13 Let L₁ be the language described by the expression (ab)*c and
having automaton M₁ with state diagram

[Figure: state diagram of M₁]

Let L₂ be the language described by the expression ab*c* and having automaton
M₂ with state diagram

[Figure: state diagram of M₂]

To find the state diagram for the language L₁L₂, place the state diagram for M₂
after the state diagram for M₁. Since there is a c-arrow from s₀ to s₂, and s₂ is
an acceptance state, add a c-arrow from s₀ to s₀′. The state diagram

[Figure: combined state diagram]

is the state diagram for M, the automaton for L₁L₂.


Example 3.14 Let L₁ and L₂ and their respective automata be the same as
those in the previous example. Finding the automaton for the language L₂L₁
is slightly more complicated. First we place the state diagram for M₁ after the
state diagram for M₂. There is an a-arrow from s₀′ to s₁′ and s₁′ is an acceptance
state, so place an a-arrow from s₀′ to s₀. There is a b-arrow from s₁′ to s₁′ and s₁′
is an acceptance state, so place a b-arrow from s₁′ to s₀. There is a c-arrow from
s₁′ to s₂′ and s₂′ is an acceptance state, so place a c-arrow from s₁′ to s₀. There is
a c-arrow from s₂′ to s₂′ and s₂′ is an acceptance state, so place a c-arrow from s₂′
to s₀. Then change s₁′, s₂′ so that they are not acceptance states. Thus we have
the state diagram

[Figure: combined state diagram]

which is the state diagram for M, the automaton for L₂L₁.


Similarly we show that if L is a language accepted by an automaton M₁ =
(Σ, Q, s₀, Υ, F) then L* is also accepted by an automaton. We now define a
new automaton M = (Σ, Q′, s₀′, Υ′, F′) which is essentially the same as M₁
except M is looped to itself. Let M be defined as follows: Create a new state
s₀′, and make it an acceptance state. We include state s₀′ so M will accept the
empty word. For each rule

If in state s₀ and a is read, go to state sⱼ

for a ∈ Σ, add the rule

If in state s₀′ and a is read, go to state sⱼ.

Thus if there is an a-arrow from s₀ to sⱼ, there is an a-arrow from s₀′ to sⱼ.
Include all of the current rules for M₁ in M. In addition, for a ∈ Σ, if there is
an a-arrow from any state s in the state diagram for M₁ to an acceptance state
in the state diagram for M₁, then also place an a-arrow from s to s₀′ in the state
diagram for M. This is the state diagram for M. The set of states Q′ = Q ∪ {s₀′}.
The set of acceptance states for M is F ∪ {s₀′}.

We thus define the rules for Υ′ as follows. For each rule

If in state s₀ and a is read, go to state sⱼ

for a ∈ Σ, add the rule

If in state s₀′ and a is read, go to state sⱼ.

If the rule is in Υ, then include this rule in Υ′. If sⱼ is an acceptance state
and the rule

If in state sᵢ and a is read, go to state sⱼ

is in Υ, then also include the rule

If in state sᵢ and a is read, go to state s₀′.

Hence there is the option of going to the acceptance state sⱼ or skipping over
to s₀′.
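The looping construction can be sketched in Python as follows. This is an illustrative sketch under assumptions made here: the automaton is a triple (rules, start, acceptance states), the name `new_start` stands for the added state s₀′ and is assumed not to clash with an existing state, and the sample automaton is one written for (ab)*c.

```python
def kleene_star(m, new_start="new_start"):
    """Add a new accepting start state and loop acceptance back to it."""
    rules0, start, finals = m
    rules = {}
    for (s, a), t in rules0.items():
        rules.setdefault((s, a), set()).add(t)
        if t in finals:
            # Arrow into an acceptance state: also allow going to the new start state.
            rules[(s, a)].add(new_start)
    # Copy every arrow leaving the old start state to the new start state.
    for (s, a), targets in list(rules.items()):
        if s == start:
            rules.setdefault((new_start, a), set()).update(targets)
    return rules, {new_start}, set(finals) | {new_start}


def nfa_accepts(nfa, word):
    """Track the set of states reachable on the word read so far."""
    rules, starts, finals = nfa
    current = set(starts)
    for a in word:
        current = {t for s in current for t in rules.get((s, a), ())}
    return bool(current & finals)


# Hypothetical automaton for (ab)*c.
m1 = ({("p0", "a"): "p1", ("p1", "b"): "p0", ("p0", "c"): "p2"}, "p0", {"p2"})
m_star = kleene_star(m1)
print(nfa_accepts(m_star, "cabc"), nfa_accepts(m_star, "ca"))
```

Using a separate new start state, rather than making s₀ itself an acceptance state, avoids accepting extra words when s₀ has incoming arrows.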


Example 3.15 Let the diagram

[Figure: state diagram of the automaton accepting L]

be the state diagram for the automaton accepting the language L; then the
diagram

[Figure: state diagram of the automaton accepting L*]

is the state diagram for the automaton accepting the language L*.
Since the relationship between an automaton and its state diagram should
be evident by now, in the future we will identify an automaton with its state
diagram. Given automata M and M′ accepting languages L and L′ respectively,
we now wish to construct an automaton M″ which accepts L ∪ L′. We wish
to read a word simultaneously in M and M′, and accept it if it is accepted by
either M or M′.

We now show how to construct M″ which accepts L ∪ L′ given automata
M = (Σ, Q, s₀, Υ, F) and M′ = (Σ, Q′, s₀′, Υ′, F′). If s₀ and s₀′ are the initial
states of M and M′ respectively then construct a new initial state s₀″, which is an
acceptance state if either s₀ or s₀′ is, and let M″ = (Σ, Q″, s₀″, Υ″, F″) where
Q″ = Q ∪ Q′ ∪ {s₀″} and F″ = F ∪ F′, together with s₀″ when it is an acceptance
state. Let Υ″ = Υ ∪ Υ′ together with the
following rules: If there is a rule

If in state s₀ and a is read, go to state sⱼ

for a ∈ Σ in Υ, include the rule

If in state s₀″ and a is read, go to state sⱼ

in Υ″.

If there is a rule

If in state s₀′ and a is read, go to state sⱼ′

for a ∈ Σ in Υ′, include the rule

If in state s₀″ and a is read, go to state sⱼ′

in Υ″.
Example 3.16 Let M be the automaton

[Figure: state diagram of M]

and M′ be the automaton

[Figure: state diagram of M′]

Using the above procedure we have the automaton M″ which accepts the union
of M(L) and M′(L) given by

[Figure: state diagram of M″]


Example 3.17 Let M be the automaton

[Figure: state diagram of M]

and M′ be the automaton

[Figure: state diagram of M′]

Using the above procedure we have the automaton M″ which accepts the union
of M(L) and M′(L)

[Figure: state diagram of M″]

An alternative method for finding the union of two automata is now given.
If Q is the set of states for M and Q′ is the set of states for M′, let Q × Q′ =
{(sᵢ, sⱼ′) : sᵢ ∈ Q, sⱼ′ ∈ Q′} be the set of states for M″. If there is an a-arrow from
sᵢ to sₖ in M and an a-arrow from sⱼ′ to sₘ′ in M′, then construct an a-arrow
from (sᵢ, sⱼ′) to (sₖ, sₘ′). In this way, we read the same letter simultaneously
in M and M′. Since a word is accepted if it is accepted by either M or M′,
if sᵢ is a terminal state in M or sⱼ′ is a terminal state in M′, then we want
(sᵢ, sⱼ′) to be a terminal state in M″. Therefore if F is the set of terminal states
of M and F′ is the set of terminal states of M′, the set of terminal states for
M″ is (Q × F′) ∪ (F × Q′). We require that both M and M′ be deterministic
since we do not want M″ to “hang” in one automaton before being accepted
in the other. This is no restriction since we have shown that any language
accepted by a nondeterministic automaton is also accepted by a deterministic
automaton.
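The pairing of states can be sketched in Python as follows. The sketch assumes each automaton is a triple (rules, start, terminal states) whose rules are defined for every state and every letter, and it generates only the pairs reachable from the start pair. The two sample automata (words ending in b, and words with an even number of a's) are hypothetical and chosen only for the illustration.

```python
from itertools import product


def product_union(m1, m2, alphabet):
    """Read each letter in both automata at once; accept if either accepts."""
    rules1, start1, finals1 = m1
    rules2, start2, finals2 = m2
    start = (start1, start2)
    rules = {}
    seen = {start}
    todo = [start]
    while todo:
        s, t = todo.pop()
        for a in alphabet:
            nxt = (rules1[(s, a)], rules2[(t, a)])
            rules[((s, t), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    # A pair is terminal when either component is terminal.
    finals = {(s, t) for (s, t) in seen if s in finals1 or t in finals2}
    return rules, start, finals


def accepts(m, word):
    """Run a deterministic automaton on a word."""
    rules, state, finals = m
    for a in word:
        state = rules[(state, a)]
    return state in finals


# Hypothetical automata: words ending in b, and words with an even number of a's.
ends_in_b = ({("p0", "a"): "p0", ("p0", "b"): "p1",
              ("p1", "a"): "p0", ("p1", "b"): "p1"}, "p0", {"p1"})
even_as = ({("q0", "a"): "q1", ("q0", "b"): "q0",
            ("q1", "a"): "q0", ("q1", "b"): "q1"}, "q0", {"q0"})
union = product_union(ends_in_b, even_as, "ab")

# Compare against the definition of union on every word of length at most 4.
words = ["".join(w) for n in range(5) for w in product("ab", repeat=n)]
mismatches = [w for w in words
              if accepts(union, w) != (accepts(ends_in_b, w) or accepts(even_as, w))]
print(mismatches)
```

Exploring from the start pair mirrors the remark that not all of Q × Q′ may be needed.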


Example 3.18 Let M be the automaton

[Figure: state diagram of M]

and M′ be the automaton

[Figure: state diagram of M′]

It may be that not all of the states in Q × Q′ are needed. We begin with (s₀, s₀′)
as the start state. Since there is an a-arrow from s₀ to s₁ and an a-arrow from
s₀′ to s₁′, we construct an a-arrow from (s₀, s₀′) to (s₁, s₁′). Since there is also a
b-arrow from s₀ to s₁ and a b-arrow from s₀′ to s₁′, we construct a b-arrow from
(s₀, s₀′) to (s₁, s₁′). Since there is an a-arrow from s₁ to s₁ and an a-arrow from s₁′
to s₂′, we construct an a-arrow from (s₁, s₁′) to (s₁, s₂′). There is a b-arrow from
s₁ to s₂ and a b-arrow from s₁′ to s₁′, so we construct a b-arrow from (s₁, s₁′) to
(s₂, s₁′). Continuing at (s₁, s₂′), there is an a-arrow from s₁ to s₁ and an a-arrow
from s₂′ to s₂′, so we construct an a-arrow from (s₁, s₂′) to (s₁, s₂′). There is a
b-arrow from s₁ to s₂ and a b-arrow from s₂′ to s₂′, so we construct a b-arrow from
(s₁, s₂′) to (s₂, s₂′). Continuing at (s₂, s₁′), there is an a-arrow from s₂ to s₁ and
an a-arrow from s₁′ to s₂′, so we construct an a-arrow from (s₂, s₁′) to (s₁, s₂′).
There is a b-arrow from s₂ to s₂ and a b-arrow from s₁′ to s₁′, so we construct
a b-arrow from (s₂, s₁′) to (s₂, s₁′). Finally, consider state (s₂, s₂′). There is an
a-arrow from s₂ to s₁ and an a-arrow from s₂′ to s₂′, so we construct an a-arrow
from (s₂, s₂′) to (s₁, s₂′). There is a b-arrow from s₂ to s₂ and a b-arrow from s₂′
to s₂′, so we construct a b-arrow from (s₂, s₂′) to (s₂, s₂′). The terminal states are
(s₁, s₂′), (s₂, s₁′), and (s₂, s₂′). Thus M″ is the automaton

[Figure: state diagram of M″]

Note that aabb is accepted by M. In M″, reading aabb takes us from state
(s₀, s₀′) to state (s₁, s₁′) to state (s₁, s₂′) to state (s₂, s₂′) to state (s₂, s₂′). Since
(s₂, s₂′) is a terminal state in M″, M″ accepts aabb. Note also that aba is accepted
by M′. In M″, reading aba takes us from state (s₀, s₀′) to state (s₁, s₁′) to state
(s₂, s₁′) to state (s₁, s₂′). Since (s₁, s₂′) is a terminal state in M″, M″ accepts
aba.


Example 3.19 Let M₁ be the automaton

[Figure: state diagram of M₁]

and M₂ be the automaton

[Figure: state diagram of M₂]

Using the same process as in the previous example, we find that M is the
automaton

[Figure: state diagram of M]





We have now shown that every operation in a regular language can be dupli- 
cated by an automaton. Hence we have the following lemma. 


Lemma 3.3 For every regular language L, there exists an automaton M so 
that L is the language accepted by M. 


The formal proof that a language accepted by an automaton is regular gives
no practical procedure for converting an automaton to a regular expression for
the language it accepts. Before giving a proof that the language accepted by an automaton
M is regular, we first give some examples where, given a certain automaton, we
can construct the language accepted by this automaton. This is not part of the
proof but only an illustration. To perform this construction, we use transition
graphs. These are merely finite state machines which read strings of a regular
expression rather than elements of Σ to change states. One of the regular
expressions we shall use is the empty word λ, so that one may change states
reading the empty word, which is equivalent to changing states without reading
anything. The form of the transition graph will become obvious as we use
them.

The process for constructing the regular expression is to first have only one
initial and one terminal state. We then eliminate one state at a time from the
state diagram and resulting transition graphs and in each case get a transition
graph with e-arrows between states, where e is a regular expression. Eventually
we get a transition graph of the form

[Figure: transition graph with arrows labeled e₁, ..., eₙ from the initial state to the terminal state]

which accepts the expression e₁ ∨ e₂ ∨ e₃ ∨ ··· ∨ eₙ.
If there is more than one terminal state, say there are terminal states
t₁, t₂, t₃, ..., tₘ, then replace the states

[Figure: the terminal states t₁, ..., tₘ]

with

[Figure: the states t₁, ..., tₘ joined to a single new terminal state]

Note that this new diagram accepts the same language as M.
To eliminate the state sᵢ we use the following rules.


(1) If the diagram

[Figure: diagram for rule (1)]

occurs, replace it with

[Figure: replacement diagram for rule (1)]

More generally if the diagram

[Figure: general diagram for rule (1)]

occurs, where e₁, e₂, e₃, ..., eₖ are regular expressions, then replace it with
the diagram

[Figure: general replacement diagram for rule (1)]

(2) If the diagram

[Figure: diagram for rule (2), with states sᵢ₋₁, sᵢ, sᵢ₊₁]

occurs, then replace it with the diagram

[Figure: a single arrow labeled ab*c from sᵢ₋₁ to sᵢ₊₁]

More generally if the diagram

[Figure: general diagram for rule (2), with arrows labeled e₁, e₂, e₃]

occurs, where e₁, e₂, e₃ are regular expressions, then replace it with the
diagram

[Figure: a single arrow labeled e₁e₂*e₃ from sᵢ₋₁ to sᵢ₊₁]

In particular, when e₂ = λ, then e₁e₂*e₃ becomes e₁e₃ so that the diagram

[Figure: diagram with arrows labeled e₁, λ, e₃]

is replaced by the diagram

[Figure: a single arrow labeled e₁e₃]

(3) If the diagram

[Figure: diagram for rule (3), with parallel arrows labeled a, b, c]

occurs, then replace it with the diagram

[Figure: a single arrow labeled a ∨ b ∨ c]

More generally if the diagram

[Figure: general diagram for rule (3), with parallel arrows labeled e₁, e₂, ..., eₖ]

occurs, where e₁, e₂, e₃, ..., eₖ are regular expressions, then replace it with
the diagram

[Figure: a single arrow labeled e₁ ∨ e₂ ∨ ... ∨ eₖ]

(4) If the diagram

[Figure: diagram for rule (4)]

occurs, then replace it with the diagram

[Figure: a single arrow labeled a(ba)*c]

More generally if the diagram

[Figure: general diagram for rule (4)]

occurs, where e₁, e₂, e₃, e₄ are regular expressions, then replace it with the
diagram

[Figure: a single arrow labeled e₁(e₂e₃)*e₄]

(5) If the diagram

[Figure: diagram for rule (5)]

occurs, then replace it with the diagram

[Figure: replacement diagram for rule (5), with an arrow labeled ac]

More generally if the diagram

[Figure: general diagram for rule (5), with arrows labeled e, e₁, e₂, ..., eₖ]

occurs, where e, e₁, e₂, e₃, ..., eₖ are regular expressions, then replace it
with the diagram

[Figure: arrows labeled ee₁, ee₂, ..., eeₖ]

Example 3.20 Assume we begin with the automaton

[Figure: state diagram of the automaton]

We then add a new terminal state T to get the automaton

[Figure: the automaton with new terminal state T]

We now apply rule (2) to get the automaton

[Figure: transition graph after applying rule (2)]

Apply rules (2) and (3) to get the automaton

[Figure: transition graph with arrows labeled ab*ab and ab*ba*a leading to T]

Hence the regular expression is ab*ab ∨ ab*ba*a.


Example 3.21 Given the automaton

[Figure: state diagram of the automaton]

we go through the following steps

[Figure: intermediate transition graphs, the last with an arrow labeled (aa*b ∨ b)a]

to get the regular expression ((aa*b ∨ b)a) ∨ aa*b.


Example 3.22 Given the automaton

[Figure: state diagram of the automaton]

we go through the following steps,

[Figure: intermediate transition graphs, with arrows labeled (a ∨ b)* and a ∨ b]

to get the regular expression (a ∨ b)(a ∨ b)*(a ∨ b). Note that the process is not
unique and that by taking different steps, we would have had a different, but
equivalent, regular expression. Thus both expressions would have described the
same set.


We now give a formal proof of the following lemma 
Lemma 3.4 The language accepted by an automaton is regular. 


Proof Given a finite deterministic automaton M = (Σ, Q, s₀, Υ, F) we wish
to show that L = L(M), the language accepted by the automaton, is regular.
To do this we will express L as the union of a finite number of regular
languages, and since the union of regular languages is regular, L is regular. Let Q
contain n elements q₁, q₂, ..., qₙ, where s₀ = q₁. For i, j = 1 to n and k = 1
to n + 1, let R(i, k, j) be the set of all words w such that (qᵢ, w) ⊢* (qⱼ, λ)
without passing through any qₘ where m ≥ k. However, qᵢ and qⱼ do not
have this restriction. Thus there is a path in the automaton such that M in
state qᵢ reads w and is then in qⱼ without passing through qₘ where m ≥ k.
Thus if (qᵢ, w) ⊢* (qₘ, w′) ⊢* (qⱼ, λ) then m < k, or m = i and w′ = w, or
m = j and w′ = λ. Hence the restriction is only on interior states of the path.
Since there are only n states, R(i, n + 1, j) = {w : (qᵢ, w) ⊢* (qⱼ, λ)}. Hence
L = ∪{R(1, n + 1, j) : qⱼ ∈ F}.

To complete the proof, we need to show that R(i, p, j) is regular for 1 ≤ p ≤
n + 1. We do this using induction. If p = 1, then there are no interior states in
the path so R(i, p, j) = {a ∈ Σ : Υ(qᵢ, a) = qⱼ} if i ≠ j and {λ} ∪ {a ∈ Σ :
Υ(qᵢ, a) = qⱼ} if i = j. Hence we have a finite set of elements of Σ and possibly
λ in the set, so it is a regular set.

Assume R(i, k, j) is regular for all i and j. The set of words R(i, k + 1, j) can be defined
as

R(i, k + 1, j) = R(i, k, j) ∪ R(i, k, k) R(k, k, k)* R(k, k, j)

where the path from qᵢ to qⱼ either does not pass through a state qₘ where m ≥ k, or
passes along a path from qᵢ to qₖ, then passes through zero or more paths
from qₖ to qₖ, and finally passes along a path from qₖ to qⱼ. None of these paths
passes through an interior state qₘ where m ≥ k. Since R(i, k + 1, j) is formed
using union, concatenation, and Kleene star of regular sets, it is regular and
hence L is regular.
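The recurrence in the proof can be checked numerically on a small automaton. The Python sketch below is an illustration only, under assumptions made here: each R(i, k, j) is represented by the finite set of its words of length at most `max_len` (with `max_len` at least 1), letters are single characters, states are numbered 1 to n, and the sample automaton accepting words with an odd number of a's is hypothetical.

```python
from itertools import product


def concat(A, B, max_len):
    """Concatenation of two word sets, keeping only words of length <= max_len."""
    return {u + v for u in A for v in B if len(u) + len(v) <= max_len}


def star(A, max_len):
    """Kleene star of a word set, truncated at length max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        frontier = concat(frontier, A, max_len) - result
        result |= frontier
    return result


def kleene_sets(n, step, alphabet, max_len):
    """Compute the truncated sets R(i, k, j) for k = 1, ..., n + 1."""
    R = {}
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # Base case: no interior states, so single letters (and lambda if i == j).
            base = {a for a in alphabet if step[(i, a)] == j}
            if i == j:
                base.add("")
            R[(i, 1, j)] = base
    for k in range(1, n + 1):
        loop = star(R[(k, k, k)], max_len)
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                through_k = concat(concat(R[(i, k, k)], loop, max_len),
                                   R[(k, k, j)], max_len)
                R[(i, k + 1, j)] = R[(i, k, j)] | through_k
    return R


# Hypothetical automaton: state 1 is the start state, state 2 means "odd number of a's".
step = {(1, "a"): 2, (1, "b"): 1, (2, "a"): 1, (2, "b"): 2}
n, finals, max_len = 2, {2}, 4
R = kleene_sets(n, step, "ab", max_len)
language = set().union(*(R[(1, n + 1, j)] for j in finals))

# Direct simulation of the automaton on every word of length at most max_len.
expected = set()
for length in range(max_len + 1):
    for letters in product("ab", repeat=length):
        q = 1
        for a in letters:
            q = step[(q, a)]
        if q in finals:
            expected.add("".join(letters))
print(language == expected)
```

Truncating by length keeps every set finite while leaving the union, concatenation, and star steps of the recurrence unchanged for the words that remain.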

Since we have now shown that every regular language is accepted by an
automaton and that the language accepted by an automaton is regular, we have
proven Kleene’s Theorem.
As a result of Kleene’s Theorem, we discover two new properties of the
regular languages:


Theorem 3.4 If L₁ and L₂ are regular languages, then
Σ* − L₁ = {x : x ∈ Σ* and x ∉ L₁}
and L₁ ∩ L₂ are regular languages.


Proof To show Σ* − L₁ is regular, let M₁ be a deterministic automaton for
L₁. To construct the automaton for Σ* − L₁, simply change all of the terminal
states in M₁ to nonterminal states and all of the nonterminal states to terminal
states. As a result, all words that were accepted because the automaton stopped
in a terminal state are no longer accepted, and all words which were not accepted
are now accepted since the automaton will now stop in a terminal state after
reading this word.

To show that L₁ ∩ L₂ is a regular language we simply use the set theory
property that

L₁ ∩ L₂ = Σ* − ((Σ* − L₁) ∪ (Σ* − L₂)).

This is most easily seen by thinking of Σ* as the universe so that Σ* − L₁ = L₁′
and the statement is simply L₁ ∩ L₂ = (L₁′ ∪ L₂′)′, which follows immediately
from De Morgan’s law and the fact that L = L″. Since the set of regular
languages is closed under union and complement (first part of theorem), it is closed
under intersection.
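Both steps of this proof can be sketched in Python. The sketch assumes deterministic automata given as triples (rules, start, terminal states) whose rules are defined for every state and letter, since otherwise swapping terminal and nonterminal states would not complement the language. The function names and the two sample automata are hypothetical, and the union step reuses the pairing of states described earlier.

```python
from itertools import product


def complement(m):
    """Swap terminal and nonterminal states of a deterministic automaton."""
    rules, start, finals = m
    states = {start} | {s for (s, _a) in rules} | set(rules.values())
    return rules, start, states - set(finals)


def product_union(m1, m2, alphabet):
    """Pair the states of two automata; a pair is terminal if either part is."""
    rules1, start1, finals1 = m1
    rules2, start2, finals2 = m2
    start = (start1, start2)
    rules, seen, todo = {}, {start}, [start]
    while todo:
        s, t = todo.pop()
        for a in alphabet:
            nxt = (rules1[(s, a)], rules2[(t, a)])
            rules[((s, t), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    finals = {(s, t) for (s, t) in seen if s in finals1 or t in finals2}
    return rules, start, finals


def intersection(m1, m2, alphabet):
    """De Morgan: the complement of the union of the complements."""
    return complement(product_union(complement(m1), complement(m2), alphabet))


def accepts(m, word):
    """Run a deterministic automaton on a word."""
    rules, state, finals = m
    for a in word:
        state = rules[(state, a)]
    return state in finals


# Hypothetical automata: words ending in b, and words with an even number of a's.
ends_in_b = ({("p0", "a"): "p0", ("p0", "b"): "p1",
              ("p1", "a"): "p0", ("p1", "b"): "p1"}, "p0", {"p1"})
even_as = ({("q0", "a"): "q1", ("q0", "b"): "q0",
            ("q1", "a"): "q0", ("q1", "b"): "q1"}, "q0", {"q0"})
both = intersection(ends_in_b, even_as, "ab")

words = ["".join(w) for n in range(5) for w in product("ab", repeat=n)]
mismatches = [w for w in words
              if accepts(both, w) != (accepts(ends_in_b, w) and accepts(even_as, w))]
print(mismatches)
```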














Exercises
(1) Let L₁ be the language accepted by the automaton

[Figure: state diagram of the automaton for L₁]

and L₂ be the language accepted by the automaton

[Figure: state diagram of the automaton for L₂]

(a) Construct the automaton which accepts the language L₁ ∪ L₂.
(b) Construct the automaton which accepts the language L₁L₂.
(c) Construct the automaton which accepts the language L₁* and the
automaton which accepts L₂*.
(2) Let L₁ be the language accepted by the automaton

[Figure: state diagram of the automaton for L₁]

and L₂ be the language accepted by the automaton

[Figure: state diagram of the automaton for L₂]

(a) Construct the automaton which accepts the language L₁ ∪ L₂.
(b) Construct the automaton which accepts the language L₁L₂.
(c) Construct the automaton which accepts the language L₁* and the
automaton which accepts L₂*.
(3) Let L₁ be the language accepted by the automaton

[Figure: state diagram of the automaton for L₁]

and L₂ be the language accepted by the automaton

[Figure: state diagram of the automaton for L₂]

(a) Construct the automaton which accepts the language L₁ ∪ L₂.
(b) Construct the automaton which accepts the language L₁L₂.
(c) Construct the automaton which accepts the language L₁* and the
automaton which accepts L₂*.
(4) Let L₁ be the language accepted by the automaton

[Figure: state diagram of the automaton for L₁]

and L₂ be the language accepted by the automaton

[Figure: state diagram of the automaton for L₂]

(a) Construct the automaton which accepts the language L₁ ∪ L₂.
(b) Construct the automaton which accepts the language L₁L₂.
(c) Construct the automaton which accepts the language L₁* and the
automaton which accepts L₂*.
(5) Using transition graphs, construct the regular language accepted by the
automaton

[Figure: state diagram for Exercise 5]

(6) Using transition graphs, construct the regular language accepted by the
automaton

[Figure: state diagram for Exercise 6]

(7) Using transition graphs, construct the regular language accepted by the
automaton

[Figure: state diagram for Exercise 7]

(8) Using transition graphs, construct the regular language accepted by the
automaton

[Figure: state diagram for Exercise 8]

3.3 Minimal deterministic automata and 
syntactic monoids 


In this section, we discuss minimal automata and the transformation monoid.
We then show how they can be combined to produce the syntactic monoid
of a language. We begin with the definition of an accessible deterministic
automaton:
Definition 3.3 The state s of an automaton M is accessible if there exists a
word w in Σ* so that if M reads the word w, it is then in state s. Equivalently
the state s of an automaton M is accessible if there exists a word w in Σ* so
that in reading w, M passes through state s. An automaton M is accessible if
every state in the automaton M is accessible.


Intuitively a state s is accessible if one can begin at s₀ and follow a path of
arrows to reach the state s.

For the moment, we shall adopt the following definition of a minimal
automaton with a finite number of states.


Definition 3.4 A deterministic automaton M is minimal if the number of
states in M is less than or equal to the number of states in any other deterministic
automaton accepting the same language as M.


Assume Υ is not empty. Obviously, if a state s in an automaton M is not
accessible, it can be removed without changing the language accepted by M,
but is M still deterministic? The answer is yes, if all of the states which are
not accessible are removed. The reason is that if there is an a-arrow from s to
s′ and s′ is not accessible, then s is not accessible, since any path from s₀ to s
could be extended to a path from s₀ to s′.

An alternative way to determine which states are accessible is to begin at the
initial state s₀ and list all states to which there is an arrow from s₀. Call this list
X. Enlarge X by adding any state to which there is an arrow from some state
already in X. Iterate until X is no longer enlarged. The list X is then the set of
accessible states.
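This enlargement process can be sketched in Python as follows. The sketch assumes the rules are given as a dictionary from (state, letter) to state, and it places s₀ itself in X from the outset, on the grounds that s₀ is reached by reading the empty word. The function names and the sample automaton are hypothetical.

```python
def accessible_states(rules, start):
    """Grow X from the start state until no arrow leads outside X."""
    X = {start}  # assumption: the start state is reached by the empty word
    grew = True
    while grew:
        grew = False
        for (s, _a), t in rules.items():
            if s in X and t not in X:
                X.add(t)
                grew = True
    return X


def remove_inaccessible(rules, start, finals):
    """Keep only the accessible states and the arrows that leave them."""
    X = accessible_states(rules, start)
    kept = {(s, a): t for (s, a), t in rules.items() if s in X}
    return kept, start, set(finals) & X


# Hypothetical automaton in which state "u" cannot be reached from "s0".
rules = {("s0", "a"): "s1", ("s1", "a"): "s0", ("u", "a"): "s1"}
trimmed = remove_inaccessible(rules, "s0", {"s1", "u"})
print(sorted(accessible_states(rules, "s0")))
```

Every arrow leaving an accessible state leads to an accessible state, so the trimmed rules remain deterministic, as the argument above requires.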

A state h is co-accessible to a state g if there is a path of arrows from
h to g; a state is co-accessible if it is co-accessible to some acceptance state.
To find the co-accessible states, reverse the arrows and begin with the
acceptance states.

A minimal automaton has no states that are not accessible or not co-accessible,
so such states may be removed.

Therefore the first step in constructing a minimal deterministic automaton is
to remove all states which are not accessible or not co-accessible. Hence a minimal
automaton is accessible and co-accessible.

We will now give two procedures for constructing the minimal automaton which
accepts a given language. The first begins with the language itself
and constructs the minimal automaton. The second begins with an automaton
accepting the language and uses it to construct the minimal automaton.

At this point, we have several problems. The first is that removing states that
are not accessible or not co-accessible does not necessarily give us a minimal
automaton, so we need to find out how to find a minimal automaton. (It does
however tell us if the language is empty or consists only of the empty word.)
The second is, for a given language, whether the minimal automaton is unique,
even up to isomorphism.

At this point we will develop a method for constructing a minimal automaton
which has two advantages. It is developed using the language, so if we use this
procedure, and define a minimal automaton to be the one developed by this
procedure, we will be able to speak of the minimal automaton. The second
advantage is that the procedure works for all languages, although we already
know that the automaton will have a finite number of states if and only if the
language is regular.

In addition we shall develop a monoid called the syntactic monoid which 
will be discussed later (see Theorem 3.7). We introduce it now because the 
development of the minimal automaton and syntactic monoid are interrelated. 

We now develop and define the intrinsic automaton and the syntactic 
monoid of an arbitrary language. 


Definition 3.5 As usual: Σ is the alphabet; Σ* is the set of all strings over
Σ and L is a language over Σ, i.e., a subset of Σ*. Relative to the language
L we define the intrinsic (or minimal) automaton M of L: The “states” of
M are the equivalence classes defined, for each x ∈ Σ*, by [x] = {y ∈ Σ* |
R(y) = R(x)}, where R(x) is the set of “right contexts” accepted by x relative
to L. Specifically: R(x) = {v ∈ Σ* | xv ∈ L}. Each symbol a in Σ “acts on”
the state [x] by [x]a = [xa], that is, Υ([x], a) = [xa]. M has a specified Start state,
[1], the class of the empty word, and a specified set of Acceptance states, {[x] | x ∈ L}. We may view M as
a directed arrow-labeled graph having the states of L as its vertices, having
directed edges ([x], a, [xa]) where the second term is in Σ and is called the
label of the arrow. The automaton M is considered to “recognize” each string
in Σ* which produces a path from the Start state to an Acceptance state.


The constructed M recognizes precisely those strings that are in L. A lan- 
guage L is regular if its automaton has only finitely many states. Since for each 
word in the language, there is a unique path from the start state to an acceptance 
state, the intrinsic automaton is minimal with regard to the above definition. 


Definition 3.6 The syntactic monoid S of L has as its elements the
equivalence classes defined, for each x ∈ Σ*, by [[x]] = {y ∈ Σ* | LR(y) = LR(x)},
where LR(x) is the set of “two-sided contexts” accepted by x relative to the
language L. Specifically, LR(x) = {(u, v) ∈ Σ* × Σ* | uxv ∈ L} and S has
an associative binary operation that is “well defined” by setting [[x]][[y]] =
[[xy]].
Since [[1]] serves as a two-sided identity for this operation, S has the structure
of a monoid, i.e., a semigroup with an identity element. The partition of Σ* into the
classes [[x]] refines the partition of Σ* into the classes [x] since LR(x) = LR(y)
implies R(x) = R(y). Consequently when S is finite L is regular.

The action of Σ on the states of L extends, inductively, to an action of Σ*
on the states of L. Consequently each string x ∈ Σ* determines a function
from the state set of L into itself defined by [u]x = [ux]. Two strings x and y
determine the same function precisely if, for every u ∈ Σ*, [ux] = [uy]. But
this holds precisely if, for all u, v ∈ Σ*, uxv ∈ L if and only if uyv ∈ L. Thus x
and y determine the same function precisely if [[x]] = [[y]]. When L is regular
there can be only a finite number of functions from the state set of L into itself.
Consequently when L is regular S is finite.


Summary For every language L we have an intrinsically associated automa- 
ton that recognizes the language and we have an associated syntactic monoid. 
The following are equivalent: (1) L is a regular language; (2) the intrinsic 
automaton for L has only finitely many states; and (3) the syntactic monoid of 
L is finite. 

The word “intrinsic” is used because each language provides a unique 
automaton using this process: not just an isomorphism type, but one unique 
set of states and arrows (or transitions). Thus if it is used there is no concern 
about isomorphism; the intrinsic automaton is unique. 

Now the question of isomorphism can come up (as it certainly will in 
elementary automata theory) when one uses some arbitrary automaton that 
recognizes the language and a different process for finding the minimal 
automaton. We shall develop another process and show that, when this process 
is applied to any automaton accepting the language, the resulting minimal 
automaton is isomorphic to the intrinsic automaton. Hence the minimal automata 
developed from different automata for the same language are isomorphic. 

We shall now use an algorithm for “collapsing pairs of states” (without 
altering the language being recognized) until no further collapsing is possible, 
thus producing a minimal automaton. 

Here is the procedure: 


Procedure 


Step 1 For each pair of states {p, q}, determine if there is a string of 
length 0 that will take exactly one of these states into a final state (and, of 
course, the other into a nonfinal state). In the case of the length zero string, 
this just means: determine whether one of these states is final and the other is 
not. If so, p and q can NEVER be collapsed without altering the language 
accepted. Mark this pair for “non-collapse”! 


Step 2 For each remaining UN-MARKED pair {p, q} and each symbol b in 
the alphabet, note {Y(p, b), Y(q, b)}. If Y(p, b) and Y(q, b) are distinct and 
the pair they form was Marked in the PREVIOUS round, then p and q can 
NEVER be collapsed without altering the language recognized. Mark such a pair 
for “non-collapse”! 
Repeat step 2 until, when the step is completed, no new pairs have been Marked. 


Note that for each pair {p,q} remaining unmarked at this stage: For any 
string s of symbols of the alphabet, Y(p, s) and Y(q, s) (starting in states p 
and q, the string s is read) must be either both final states or both non-final 
states. 

Note that the following defines an equivalence relation in the set S of states 
of the original automaton: p ~ q if {p, q} is an unmarked pair. 

Collapse the state set S of the original automaton onto the set S/~. A state 
in S/~ is final if it consists of final states of S. Each Y(p, s) = q of the original 
automaton provides Y′([p], s) = [q] in the (minimized) automaton, where [p] and 
[q] are the ~ equivalence classes containing p and q. 
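
The procedure above can be sketched in Python. This is a minimal illustration under stated assumptions, not the book's own code: the automaton is assumed to be complete and deterministic, the transition function is assumed to be stored as a dictionary `delta[(state, symbol)]`, and the four-state automaton used as input is hypothetical. The sketch marks a pair as soon as any previously marked pair is reached, which arrives at the same final set of marked pairs as marking round by round.

```python
from itertools import combinations

def minimize(states, alphabet, delta, finals):
    """Return the classes of S/~ found by marking distinguishable pairs."""
    # Step 1: mark pairs in which exactly one state is final.
    marked = {frozenset(pair) for pair in combinations(states, 2)
              if (pair[0] in finals) != (pair[1] in finals)}
    # Step 2: repeat until a full pass marks nothing new.
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for b in alphabet:
                target = frozenset((delta[p, b], delta[q, b]))
                # Distinct targets forming a marked pair force p, q apart.
                if len(target) == 2 and target in marked:
                    marked.add(pair)
                    changed = True
                    break
    # Unmarked pairs define ~; group the states into equivalence classes.
    classes = []
    for s in states:
        for cls in classes:
            if frozenset((s, cls[0])) not in marked:
                cls.append(s)
                break
        else:
            classes.append([s])
    return classes

# Hypothetical four-state automaton, used only for illustration.
states = ["s0", "s1", "s2", "s3"]
alphabet = ["a", "b"]
delta = {
    ("s0", "a"): "s1", ("s0", "b"): "s0",
    ("s1", "a"): "s2", ("s1", "b"): "s0",
    ("s2", "a"): "s3", ("s2", "b"): "s3",
    ("s3", "a"): "s3", ("s3", "b"): "s3",
}
finals = {"s2", "s3"}
classes = minimize(states, alphabet, delta, finals)
print(classes)
```

For this input the states s2 and s3 end up in one class, since every string takes both of them to a final state.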


Example 3.23 Let M be the deterministic automaton 

[Figure: state diagram of M, not recoverable from the source] 

The unmarked pairs in step 1 are {s0, s0}, {s1, s1}, {s2, s2}, {s3, s3}, {s0, s1}, and 
{s2, s3}. The unmarked pairs in the first use of step 2 are {s0, s0}, {s1, s1}, 
{s2, s2}, {s3, s3}, and {s2, s3}, since there is an a-arrow from s0 to s1 and an 
a-arrow from s1 to s2 and the states s1 and s2 are not in the unmarked pairs for 
step 1. Further uses of step 2 produce no new results. 

The equivalence classes are {{s0}, {s1}, {s2, s3}}, and we are finished. In 
the graph shown below, only one element is picked from each equivalence 
class. 
Therefore a minimal deterministic automaton is the automaton 

[Figure: state diagram of the minimal automaton, not recoverable from the source] 


Example 3.24 Let M be the deterministic automaton 

[Figure: state diagram of M, not recoverable from the source] 

The unmarked pairs in step 1 are 

{s0, s0}, {s1, s1}, {s2, s2}, {s3, s3}, {s4, s4}, {s0, s2}, {s0, s1}, {s1, s2}, and {s3, s4}. 

The unmarked pairs in the first use of step 2 are 

{s0, s0}, {s1, s1}, {s2, s2}, {s3, s3}, {s4, s4}, {s1, s2}, and {s3, s4}. 

The second use of step 2 produces no new results so the equivalence classes are 
{{s0}, {s1, s2}, {s3, s4}}, and we are finished. 

Therefore a minimal deterministic automaton is the automaton where an 
element is picked from each equivalence class. 

[Figure: state diagram of the minimal automaton, not recoverable from the source] 


Now, one can see that this minimized version of the arbitrary automaton 
recognizing the language L is virtually identical with the intrinsic automaton 
of the language. 


Theorem 3.5 For a given regular language L, the two minimal reduced 
automata developed above accepting the language L are isomorphic. 

Proof We show that M = (Σ, Q, s0, Y′, F), the minimal reduced automaton 
developed by the collapsing method, is isomorphic to the intrinsic minimal 
automaton M_i = (Σ, Q_i, [1], Y_i, F_i). Define f : Q → Q_i by 

f([x]) = {w ∈ Σ* : Y(s0, w) ∈ [x]}. 
Thus 
f([x]) = {w ∈ Σ* : Y(s0, w) = x for some x ∈ [x]}. 

Assume [x] = [y]; then Y(x, u) ∈ F if and only if Y(y, u) ∈ F for all u ∈ Σ*. 
Let f([x]) = [w] and f([y]) = [w′]. 

Then wu ∈ L if and only if w′u ∈ L (= F_i). Hence [w] = [w′] and f is 
well defined. Conversely, assume f([x]) = f([y]); then wu ∈ L if and only if 
w′u ∈ L (= F_i) where Y(s0, w) = x and Y(s0, w′) = y. Hence Y(x, u) ∈ F if 
and only if Y(y, u) ∈ F and [x] = [y]. Hence f is well defined and one-to-one. 

Finally we must show that f(Y′([x], a)) = Y_i(f([x]), a), where 

Y′([x], a) = [Y(x, a)] 
and 
Y_i(f([x]), a) = f([x])a. 

Let w ∈ f([x]); then Y(s0, w) = x for some x ∈ [x]. Let 

Y(x, a) = y ∈ [Y(x, a)] = Y′([x], a) 

and [y] = Y′([x], a). Now Y(s0, wa) = y, so 

f([y]) = [wa]_i = [w]_i a = f([x])a = Y_i(f([x]), a) 

and so f(Y′([x], a)) = Y_i(f([x]), a). 


Corollary 3.1 For a given regular language, all reduced automata which 
accept that language are unique up to isomorphism. 


Instead of looking at the syntactic monoid from the intrinsic point of view, 
as defined above we examine it using an automaton. In particular we look at 
minimal automata. 

The transformation monoid of a deterministic automaton 

M = (Σ, Q, s0, Y, F) 

is the image of a homomorphism φ from Σ* to a submonoid T_M of the monoid 
of all functions from Q to Q. If a ∈ Σ, then φ(a) = ā where for each s_i ∈ Q, 
ā(s_i) = s_j if there is an a-arrow from s_i to s_j, i.e. Y(s_i, a) = s_j. If a, b ∈ Σ, 
then φ(ab) = āb̄ where āb̄(s) = ā(b̄(s)). More specifically, for u ∈ Σ*, ū(s_i) = s_j 
if (s_i, u) ⊢* (s_j, λ). In other words, if the machine is in state s_i and reads u, then 
it is in state s_j. 
Let M be the automaton 

[Figure: state diagram of M with states s0, s1, s2, s3 and arrows labeled a and b] 

then 

ā(s0) = s1   ā(s1) = s2   ā(s2) = s3 
ā(s3) = s3   b̄(s0) = s2   b̄(s1) = s3 
b̄(s2) = s2   b̄(s3) = s2. 

For convenience, permutation notation is used here although the functions are 
not usually permutations, since they are not one-to-one. Thus we have 

    ( s0 s1 s2 s3 )            ( s0 s1 s2 s3 ) 
ā = ( s1 s2 s3 s3 )   and b̄ = ( s2 s3 s2 s2 ) 

which we shall shorten to 

    ( 0 1 2 3 )            ( 0 1 2 3 ) 
ā = ( 1 2 3 3 )   and b̄ = ( 2 3 2 2 ). 

By definition let λ̄ be the identity function, 

    ( 0 1 2 3 ) 
λ̄ = ( 0 1 2 3 ). 

We now perform the following products: 

       ( 0 1 2 3 ) ( 0 1 2 3 )   ( 0 1 2 3 ) 
āb̄  = ( 1 2 3 3 ) ( 2 3 2 2 ) = ( 3 3 3 3 ) 

       ( 0 1 2 3 ) ( 0 1 2 3 )   ( 0 1 2 3 ) 
b̄ā  = ( 2 3 2 2 ) ( 1 2 3 3 ) = ( 3 2 2 2 ) 

       ( 0 1 2 3 ) ( 0 1 2 3 )   ( 0 1 2 3 ) 
āā  = ( 1 2 3 3 ) ( 1 2 3 3 ) = ( 2 3 3 3 ) 

       ( 0 1 2 3 ) ( 0 1 2 3 )   ( 0 1 2 3 ) 
b̄b̄  = ( 2 3 2 2 ) ( 2 3 2 2 ) = ( 2 2 2 2 ) 

       ( 0 1 2 3 ) ( 0 1 2 3 )   ( 0 1 2 3 ) 
āāb̄ = ( 2 3 3 3 ) ( 2 3 2 2 ) = ( 3 3 3 3 ) = āb̄ 
Continuing this process and letting γ = āb̄, δ = āā, ε = b̄b̄, and ζ = b̄ā, the 
table for the transformation monoid T_M is seen to be 

[Table: multiplication table of T_M on the elements λ̄, ā, b̄, γ, δ, ε, ζ; entries not recoverable from the source] 








Example 3.25 Let M be the automaton 

[Figure: state diagram of M, not recoverable from the source] 

By definition let λ̄ be the identity function on the states of M. 
The table for the transformation monoid T_M is seen to be 

[Table: multiplication table of T_M, not recoverable from the source] 








Theorem 3.6 Let M = (Σ, Q, s0, Y, F) be a minimal deterministic automaton 
and T_M be the transformation monoid for M; then T_M is finite. 


Proof Each element of T_M is a function from Q to Q. If Q contains n elements, 
then there are n^n possible functions from Q to Q. Therefore the order of T_M is 
less than or equal to n^n. 
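
As an illustration of the bound n^n, the following Python sketch closes the two generator maps of the four-state example above under composition. It is a sketch under stated assumptions: the tuple encoding (`f[s]` is the image of state s) and the function name `transformation_monoid` are hypothetical, and composition follows the convention āb̄(s) = ā(b̄(s)) stated in the text.

```python
def transformation_monoid(n_states, generators):
    """Close the generator maps under composition, starting from the identity."""
    identity = tuple(range(n_states))

    def compose(f, g):
        # (f g)(s) = f(g(s)), the convention stated in the text.
        return tuple(f[g[s]] for s in range(n_states))

    elements = {identity}
    frontier = [identity]
    while frontier:
        f = frontier.pop()
        for g in generators.values():
            h = compose(f, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements, compose

# The maps for a and b from the example, written as tuples over states 0..3.
a = (1, 2, 3, 3)
b = (2, 3, 2, 2)
elements, compose = transformation_monoid(4, {"a": a, "b": b})
print(len(elements), "elements; bound n^n =", 4 ** 4)
```

Here the closure contains seven maps (the identity, ā, b̄, and the four products named γ, δ, ε, ζ in the text), well below the bound 4^4 = 256.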














Theorem 3.7 The syntactic monoid of a regular language L is isomorphic 
to the transformation monoid of the minimal deterministic automaton M that 
accepts L. 


Proof By the discussion following Definition 3.6, the syntactic monoid 
can be considered to be the transformation monoid of the intrinsic minimal 
deterministic automaton. Since all minimal deterministic automata are isomorphic 
to the intrinsic minimal deterministic automaton, the transformation monoid 
is isomorphic to the syntactic monoid. 














We now examine some of the properties of the syntactic monoid of a lan- 
guage. Unlike the transformation monoid, as mentioned above, the syntactic 
monoid of a language also exists for languages that are not regular. 


Definition 3.7 Let φ be a homomorphism from Σ* to a monoid Q. A set 
L ⊆ Σ* is recognized by Q if φ⁻¹φ(L) = L. 

Theorem 3.8 Let L ⊆ Σ*. The following conditions are equivalent. 

(i) L is a regular language. 
(ii) The syntactic monoid Syn(L) is finite. 
(iii) L is recognized by a finite monoid Q. 

Proof (i)⇒(ii) If L is a regular language, then its syntactic monoid is 
isomorphic to the transformation monoid of the minimal automaton accepting L 
and hence is finite. 

(ii)⇒(iii) Assume φ is the homomorphism from Σ* to Syn(L). If w ∈ L and 
φ(w) = φ(w′), then uwv ∈ L if and only if uw′v ∈ L for all u, v ∈ Σ*. In 
particular λwλ ∈ L if and only if λw′λ ∈ L. So w ∈ L if and only if w′ ∈ L. 
Therefore φ⁻¹φ(L) = L. Since Syn(L) is finite, L is recognized by a finite 
monoid. 

(iii)⇒(i) Assume L is recognized by a finite monoid Q and let φ : Σ* → Q. 
To show L is a regular language, we construct an automaton M = (Σ, Q, s0, Y, F) 
that accepts L. Let the state set be the monoid Q itself. Define Y : Σ × Q → Q by 
Y(a, m) = mφ(a), for all m ∈ Q and a ∈ Σ. Let s0 = 1, the identity element of Q, 
and F = φ(L). Then w ∈ L(M) if and only if Y(w, 1) ∈ φ(L) if and only if 
w ∈ φ⁻¹(φ(L)) = L. 
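
The construction in (iii)⇒(i) can be tried on a small case. The sketch below is an illustration under stated assumptions, not part of the original text: the monoid is taken to be the integers modulo 2 under addition, φ sends a to 1 and b to 0, and L is the set of words with an even number of as, so that φ(L) = {0}. The names `accepts_via_monoid`, `phi`, and `add_mod2` are hypothetical.

```python
def accepts_via_monoid(word, phi, multiply, identity, accepting):
    """Run the automaton whose states are monoid elements: Y(a, m) = m * phi(a)."""
    m = identity                      # start state: the identity element
    for letter in word:
        m = multiply(m, phi[letter])  # one transition per letter read
    return m in accepting             # acceptance set F = phi(L)

# Assumed example: Z/2 under addition recognizes "even number of as".
phi = {"a": 1, "b": 0}

def add_mod2(x, y):
    return (x + y) % 2

print(accepts_via_monoid("abab", phi, add_mod2, 0, {0}))
print(accepts_via_monoid("ab", phi, add_mod2, 0, {0}))
```

The state reached after reading w is just φ(w), which is why membership in φ(L) decides membership in L.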














Exercises 


(1) Find the minimal automaton which accepts the same language as the 
automaton 

[Figure: state diagram for Exercise 1, not recoverable from the source] 

(2) Find the minimal automaton which accepts the same language as the 
automaton 

[Figure: state diagram for Exercise 2, not recoverable from the source] 

(3) Find the minimal automaton which accepts the same language as the 
automaton 

[Figure: state diagram for Exercise 3, not recoverable from the source] 

(4) Find the minimal automaton which accepts the same language as the 
automaton 

[Figure: state diagram for Exercise 4, not recoverable from the source] 

(5) Find the minimal automaton which accepts the language described by 
aa*(b ∨ c). 

(6) Find the minimal automaton which accepts the language described by 
a(b ∨ c)*bb*. 

(7) Find the minimal automaton which accepts the language described by 
(abc)*(b ∨ c). 

(8) Find the minimal automaton which accepts the language described by 
(a ∨ bc)c(ab)*. 

(9) Find the syntactic monoid of the language accepted by the automaton 

[Figure: state diagram for Exercise 9, not recoverable from the source] 

(10) Find the syntactic monoid of the language accepted by the automaton 

[Figure: state diagram for Exercise 10, not recoverable from the source] 

(11) Find the syntactic monoid of the language accepted by the automaton 

[Figure: state diagram for Exercise 11, not recoverable from the source] 

(12) Find the syntactic monoid of the language accepted by the automaton 

[Figure: state diagram for Exercise 12, not recoverable from the source] 

(13) Find the syntactic monoid of the language accepted by the automaton 

[Figure: state diagram for Exercise 13, not recoverable from the source] 


3.4 Pumping Lemma for regular languages 





We now show that certain languages are not regular languages. To do so we 
first prove a lemma known as the Pumping Lemma. 


Lemma 3.5 (Pumping Lemma) Let L be an infinite regular language. There 
exists a constant n such that if z ∈ L and |z| > n, then there exist u, v, w ∈ Σ*, 
v ≠ λ, such that z = uvw and uv^k w ∈ L for all k ≥ 0. The length of the string 
uw is less than or equal to n. Further, if M is an automaton accepting the 
language L and M has q states, then n ≤ q. It is possible to have the stronger 
statement that z = uvw where the length of uv is less than or equal to q. 


Proof Let L be accepted by the automaton M = (Σ, Q, s0, Y, F) with q states. If 
Y(s_i, a_i) = s_{i+1} for i = 1 to t, denote this by 

(s_1, a_1a_2a_3 ... a_t) ⊢* (s_{t+1}, λ). 

Since L is infinite, L contains a word of length m, where m > q, say z = a_1a_2a_3 ... a_m. 
Note that if (s_1, a_1a_2a_3 ... a_m) ⊢* (s_m, λ), then s_m is an acceptance state. Since 
m > q, in reading z, M must pass through the same state twice. Therefore 
(s_1, a_1a_2a_3 ... a_{j−1}) ⊢* (s_j, λ) and (s_1, a_1a_2a_3 ... a_{k−1}) ⊢* (s_k, λ) with 
s_j = s_k for some j < k, and both 

(s_j, a_ja_{j+1} ... a_m) ⊢* (s_m, λ) and (s_j, a_ka_{k+1} ... a_m) ⊢* (s_m, λ). 
Thus 
(s_1, a_1a_2a_3 ... a_m) ⊢* (s_m, λ) and (s_1, a_1a_2 ... a_{j−1}a_ka_{k+1} ... a_m) ⊢* (s_m, λ). 

Also (s_j, a_ja_{j+1} ... a_{k−1}) ⊢* (s_j, λ), so in reading a_ja_{j+1} ... a_{k−1}, M returns 
to the same state and 

(s_1, a_1a_2 ... a_{j−1}(a_ja_{j+1} ... a_{k−1})^n a_ka_{k+1} ... a_m) ⊢* (s_m, λ). 

Letting u = a_1a_2 ... a_{j−1}, v = a_ja_{j+1} ... a_{k−1}, and w = a_ka_{k+1} ... a_m, we 
have uv^n w ∈ L for n ≥ 0. 

Since |uw| < |uvw| = m, if |uw| > q, we can repeat this process on uw 
until eventually we have u′(v′)^n w′ ∈ L for n ≥ 0 where |u′w′| ≤ q. For the 
stronger statement, let v be the first cycle in z produced by the same state being 
passed through twice when the automaton is reading z. Then the length of uv is 
less than or equal to q. 
Note that it is no longer true that the length of uw is less than q. 
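
The choice of v as the first cycle can be sketched in Python. This is an illustration under stated assumptions: the automaton is deterministic with transitions in a dictionary `delta[(state, letter)]`, and the two-state example (accepting words over {a, b} with an even number of as) is hypothetical. The function names `pump_split` and `accepts` are not from the text.

```python
def pump_split(delta, start, word):
    """Split word into (u, v, w), where v is the first cycle met while reading it."""
    seen = {start: 0}  # state -> number of letters read when it was first visited
    state = start
    for i, letter in enumerate(word, 1):
        state = delta[state, letter]
        if state in seen:
            j = seen[state]
            return word[:j], word[j:i], word[i:]
        seen[state] = i
    return None  # no state repeats: the word is shorter than the number of states

def accepts(delta, start, finals, word):
    state = start
    for letter in word:
        state = delta[state, letter]
    return state in finals

# Assumed example: state 0 = even number of as so far, state 1 = odd.
delta = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
u, v, w = pump_split(delta, 0, "abba")
print(u, v, w)
print(all(accepts(delta, 0, {0}, u + v * k + w) for k in range(5)))
```

Because the first repeated state must occur within the first q letters read, the split returned here satisfies |uv| ≤ q, as in the stronger statement of the lemma.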














Using this lemma, we have the following theorem: 


Theorem 3.9 The language L = {a^n b^n : n ≥ 1} is not regular. 


Proof Assume L = {a^n b^n : n ≥ 1} is regular. Since L is infinite, there exist 
strings u, v, w ∈ Σ*, v ≠ λ, such that uv*w ⊆ L. There are three possibilities. 
First, u = a^{m−k}, v = a^k, and w = b^m for some m. But then a^k a^m b^m = 
a^{m+k} b^m ∈ L, which is a contradiction. Second, u = a^m, v = b^k, and w = b^{m−k}. 
By a similar argument, we reach a contradiction. Third, u = a^{m−k}, v = a^k b^r, and 
w = b^{m−r}. But then a^{m−k} a^k b^r a^k b^r b^{m−r} ∈ L, which is a contradiction. Hence 
L is not regular. 














Exercises 


For each of the following sets, determine if the set is regular. If it is, describe the 
set with a regular expression. If it is not a regular set, use the Pumping lemma 
to show that it is not. 


(1) {a^{2n} b^n : n ≥ 1}. 
(2) {a^n b^{2n} a^n : n ≥ 1}. 
(3) {(ab)^n : n ≥ 1}. 
(4) {a^n b^n a^n : n ≥ 1}. 
(5) {a^m b^n : m, n ≥ 1}. 
(6) {ww : w ∈ Σ* and |Σ| = 2}. 
(7) fa" :n> 1}. 
(8) {w ∈ {a, b}* : w contains an equal number of as and bs}. 
(9) {w ∈ {a, b}* : w contains exactly four bs}. 
(10) {ww^R : w ∈ {a, b}* and the length of w is less than or equal to three}. 
(11) {ww^R : w ∈ {a, b}*}. 
(12) {wcw^R : w ∈ {a, b, c}*}. 
(13) {ww̄ : w ∈ {0, 1}* and w̄ is the 1s complement of w}. 
(14) {w ∈ {a, b, c}* : the length of w is n^2 for some n ≥ 1}. 
(15) {w ∈ {a, b, c}* : the length of w > n for some n ≥ 1}. 
(16) {w ∈ {a, b}* : w contains more as than bs}. 


3.5 Decidability 


In this section we answer the questions 


(1) Is there an algorithm for determining whether the language accepted by a 
finite automaton is empty? 

(2) Is there an algorithm for determining whether two finite automata accept 
the same language? 

(3) Is there an algorithm for determining whether two regular languages are 
the same? 

(4) Is there an algorithm for determining whether a language accepted by an 
automaton is infinite? 


The key to all of these questions is that they require the algorithm to be able 
to provide a yes or no answer. We are not concerned with the efficiency of the 
algorithm but only if within some finite length of time the algorithm can answer 
the question. Note that if an algorithm can determine that a statement is true (or 
false) within some bounded length of time, then the algorithm can determine 
whether the statement is true. 

We begin with a proof of the first question although we can see that if we 
can answer the second question, then we can answer the first question. Given a 
language L, as an expression, we simply determine the automaton that accepts 
L and see if the language accepted is empty. 
Theorem 3.10 There is an algorithm for determining whether the language 
M(L) accepted by a finite automaton M is empty. 


Proof Let M have n states. Then M(L) is empty if and only if s0 is not an 
acceptance state and no string of length less than n is accepted, since the shortest 
string accepted by M cannot cause M to enter a state twice. Since there are only a 
finite number of these strings, they can be checked. 
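
The check described in this proof can be written directly. The sketch below is a hedged illustration: it assumes a complete deterministic automaton with transitions in a dictionary `delta[(state, letter)]`, the function name `is_empty` is hypothetical, and both example automata are made up. It tries every string of length 0 through n − 1, where length 0 corresponds to testing whether s0 is an acceptance state.

```python
from itertools import product

def is_empty(states, alphabet, delta, start, finals):
    """Brute-force emptiness test: try every string of length less than n."""
    n = len(states)
    for length in range(n):  # lengths 0, 1, ..., n - 1
        for letters in product(alphabet, repeat=length):
            state = start
            for letter in letters:
                state = delta[state, letter]
            if state in finals:
                return False  # found an accepted string
    return True

# Assumed example 1: the acceptance state 1 can never be reached.
delta_dead = {(0, "a"): 0, (1, "a"): 1}
print(is_empty([0, 1], ["a"], delta_dead, 0, {1}))

# Assumed example 2: the string "a" is accepted.
delta_live = {(0, "a"): 1, (1, "a"): 1}
print(is_empty([0, 1], ["a"], delta_live, 0, {1}))
```

The proof is not concerned with efficiency, and neither is this sketch; a reachability search over the state graph would answer the same question without enumerating strings.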














Theorem 3.11 There is an algorithm for determining whether two finite 
automata accept the same language. 


Proof We already know that given automata M1 and M2 accepting languages 
M1(L) and M2(L), respectively, we can construct automata accepting the 
languages M1(L) ∩ M2(L) and M1(L) ∪ M2(L), as well as the complements 
M1(L)′ and M2(L)′. Combining these constructions, we can find an automaton 
which accepts (M1(L) ∩ M2(L)′) ∪ (M2(L) ∩ M1(L)′), the symmetric difference 
of M1(L) and M2(L). But this set is empty if and only if M1(L) = M2(L). Hence 
we use the previous theorem to determine whether 
(M1(L) ∩ M2(L)′) ∪ (M2(L) ∩ M1(L)′) is empty. 
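
A compact way to carry out this test is to walk the product of the two automata and look for a reachable pair of states in which exactly one is accepting; such a pair exists precisely when the symmetric difference is nonempty. The Python sketch below is an illustration under stated assumptions: both automata are complete and deterministic over the same alphabet, the function name `equivalent` is hypothetical, and the emptiness test is done by a reachability search rather than by enumerating short strings as in Theorem 3.10.

```python
from collections import deque

def equivalent(alphabet, delta1, start1, finals1, delta2, start2, finals2):
    """Return True if the two automata accept the same language."""
    start = (start1, start2)
    seen = {start}
    queue = deque([start])
    while queue:
        p, q = queue.popleft()
        # A reachable pair with exactly one accepting state witnesses
        # a word in the symmetric difference.
        if (p in finals1) != (q in finals2):
            return False
        for a in alphabet:
            nxt = (delta1[p, a], delta2[q, a])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

alphabet = "ab"
# Assumed M1: two states, accepts words with an even number of as.
delta1 = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
# Assumed M2: four states counting the as modulo 4.
delta2 = {(i, "a"): (i + 1) % 4 for i in range(4)}
delta2.update({(i, "b"): i for i in range(4)})

print(equivalent(alphabet, delta1, 0, {0}, delta2, 0, {0, 2}))
print(equivalent(alphabet, delta1, 0, {0}, delta2, 0, {0}))
```

With acceptance states {0, 2}, M2 also accepts exactly the words with an even number of as; with acceptance state {0} alone it rejects aa, which M1 accepts.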














Theorem 3.12 There is an algorithm for determining whether two regular 
languages are the same. 


Proof Given expressions for L1 and L2, find the automata M1 and M2 so that 
L1 = M1(L) and L2 = M2(L). Now use the previous theorem to see if the two 
automata accept the same language. 














Before proving the next theorem, we need the following lemma. 


Lemma 3.6 Assume that an automaton M has n states. The language L 
accepted by M is infinite if and only if there is a word in L whose length 
is greater than n and less than or equal to 2n. 


Proof First assume L is infinite. By the Pumping Lemma there exist u, v, w 
with uv^m w ∈ L for all m ≥ 0. Further, if M is an automaton accepting the 
language L and M has n states, then |uw|, the length of the string uw, is less 
than or equal to n. Assume that after u is read, the machine is in state s. If 
while reading v, the machine returns to s, let v′ be the string that is read when 
the machine first returns to s and v′x = v. Thus if we have 

(s0, uvw) ⊢* (s, vw) ⊢* (s, w) ⊢* (s2, λ), 
replace it with 

(s0, uv′w) ⊢* (s, v′w) ⊢* (s, w) ⊢* (s2, λ). 
Thus, starting in s0, M accepts the string u(v′)^k w for any nonnegative integer k. 
If while reading v′, a state t is repeated, remove all of the states including one 
of the ts as well as the letters in v′ that were read in this cycle. Thus we are 
simply removing all cycles in v′. Call this string v″. Since reading v″ uses no 
repeated states except s, the length of v″ is less than or equal to n. Thus the 
length of uv″w is less than or equal to 2n. If the length of uv″w is less than or 
equal to n, there exists a least integer m so that the length of u(v″)^m w is 
greater than n. Since the length of v″ is less than or equal to n, the length of 
u(v″)^m w is less than or equal to 2n. 

Conversely, in the proof of the Pumping Lemma, we showed that if there is 
a word in the language with length m greater than n, then for every positive 
integer r, the word uv^r w ∈ L, where v is nonempty. Hence L is infinite. 


Theorem 3.13 There is an algorithm for determining whether a language 
accepted by an automaton is infinite. 


Proof Let M have n states. Then M(L) is infinite if and only if M accepts a 
string s with n < |s| ≤ 2n. Since there are only a finite number of such words, 
check each of them to see if it is accepted by the automaton. 


Theorem 3.14 There is an algorithm for determining whether a language 
accepted by an automaton is finite. 


Proof Using the proof of the previous theorem, if there is no string s accepted 
by M with n < |s| ≤ 2n, then M(L) is finite (where we include the empty set 
and the set containing only the empty word as finite sets). 
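
The test in Theorems 3.13 and 3.14 can be sketched as follows. This is a hedged illustration, not the book's procedure verbatim: instead of listing every string, it tracks the set of states reachable by strings of each exact length, which answers the same question, and it checks lengths n + 1 through 2n, the range that the proof of Lemma 3.6 derives (greater than n and at most 2n). The automaton is assumed complete and deterministic, and the name `is_infinite` and both examples are hypothetical.

```python
def is_infinite(states, alphabet, delta, start, finals):
    """True if some accepted string has length in n + 1 .. 2n."""
    n = len(states)
    current = {start}  # states reachable by strings of the current exact length
    for length in range(1, 2 * n + 1):
        current = {delta[s, a] for s in current for a in alphabet}
        if length > n and current & set(finals):
            return True
    return False

# Assumed example 1: even number of as over {a, b}; an infinite language.
delta_even = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
print(is_infinite([0, 1], "ab", delta_even, 0, {0}))

# Assumed example 2: accepts only the word "a"; state 2 is a dead state.
delta_one = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 2}
print(is_infinite([0, 1, 2], "a", delta_one, 0, {1}))
```

Negating the result gives the finiteness test of Theorem 3.14.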














Theorem 3.15 There is an algorithm for determining whether a regular language 
L1 ⊆ L2. 


Proof We already know that there is an automaton that accepts L1 ∩ L2′, which 
is empty if and only if L1 ⊆ L2. 














Exercises 


(1) Prove there is an algorithm for determining if regular language M(L) = 
x. 

(2) Prove there is an algorithm for determining if a regular language M(L) 
contains a word that contains a given letter of the alphabet. 

(3) Prove there is an algorithm for determining if every letter in the alphabet is 
contained in some word in a regular language L. 
(4) Prove that for a positive integer n, there is an algorithm for determining if 
a regular language contains a word with length less than n that contains a 
given letter of the alphabet. 

(5) Prove that for a positive integer n, there is an algorithm for determining if 
every letter in the alphabet is contained in some word with length less than 
n in a regular language L. 

(6) Prove there is an algorithm for determining if a regular language contains 
a word that begins with a given letter of the alphabet. 

(7) Prove there is an algorithm for determining if there is a word in a regular 
language L of even length. 

(8) Prove that for any integer k there is an algorithm for determining if there is 
a word in a regular language L of length mk for some m. 

(9) Prove that for a regular language L, it is possible to determine if Σ* − L is 
finite. 


3.6 Pushdown automata 


In Section 3.4 we showed that the set {a^n b^n : n is a positive integer} 
is not a regular language. Therefore it cannot be accepted by an automaton. 
Intuitively, the problem is that after the automaton has read the as in a word, 
it cannot remember how many it has read, so it does not know how many 
bs it should read. The automaton basically needs a memory so that it can 
remember the letters it has read. A pushdown automaton or PDA is essentially 
an automaton together with a very simple memory. The memory is called a 
pushdown stack. Associated with the stack is a set of symbols called the stack 
symbols. A stack symbol may be placed on the stack. This process is called 
pushing the symbol onto the stack. If x is a stack symbol, then push x simply 
means x is placed on the stack. The top symbol may also be removed from the 
stack. This is the last symbol placed on the stack. Since the last symbol placed 
in the stack is the first out, the stack is said to have the LIFO (last in—first out) 
property. Thus the symbols are removed from the stack in reverse order from 
the order they were put in the stack. The process of removing the top symbol 
from the stack is called popping the stack. If x is a stack symbol then pop x 
simply means that when the stack is popped, the symbol x is removed if it is 
on top of the stack. The purpose of the stack is to allow the PDA to remember 
the letters in the word that it has read so that it can duplicate them or replace 
them with other letters. 

Assume that the word to be read is placed on a tape. The tape is divided 
into little squares with the letters of the word in the first squares. The rest of 
the tape is considered to be blank. Since the words may be arbitrarily long, it 
is best to use an infinite tape. These may have to be custom made. One of the 
advantages of mathematics is that mathematical structures do not usually have 
to be actually constructed. 

The PDA, beginning at the left, reads a letter at a time in the same manner as 
a standard automaton. The PDA may read a letter from the tape or pop (remove 
from the top) and read a symbol from the stack or both. Depending on its current 
state and the symbol(s) read, the PDA may change state, push a symbol in the 
stack, or both. 












































[Figure: a processor reading letters from an input tape and pushing and popping symbols on a stack] 














We now define a PDA more formally. 
Let Σ^λ = Σ ∪ {λ} and Γ^λ = Γ ∪ {λ}. 


Definition 3.8 A pushdown automaton is a sextuple 
M = (Σ, Q, s, Γ, Y, F) 


where Σ is a finite alphabet, Q is a finite set of states, s is the initial or starting 
state, Γ is a finite set of stack symbols, Y is the transition relation and F is the set 
of acceptance states. The relation Y is a subset of 


(Σ^λ × Q × Γ^λ) × (Q × Γ^λ). 


Thus the relation reads a letter from Σ^λ, determines the state, and reads a 
letter from Γ^λ. It then changes state or remains in the same state and gives a 
letter of Γ^λ as output. Similar to the automata, the letter of a word is removed 
when it is read. The top letter on the stack is also removed when it is read. As 
discussed above, we say it is popped from the top of the stack. The letter of Γ 
produced by the relation is placed on top of the stack or pushed on the stack as 
discussed above. A word is accepted by the PDA if and only if after beginning 
in the start state, with an empty stack, the word is read, if possible, the machine 
is in an acceptance state, and the stack is empty. If all of the above do not occur, 
then the word is rejected. The language consisting of all words accepted by the 
pushdown automaton M is denoted by M(L). 
Elements of Y have the following rules: 


((a, s, E), (t, D))   In state s, a is read and E is popped; go to state t and 
                      push D. 
((a, s, λ), (t, D))   In state s, a is read; go to state t and push D. 
((λ, s, λ), (s, D))   In state s, push D. 
((a, s, E), (t, λ))   In state s, a is read; pop E and go to state t. 
((λ, s, E), (s, λ))   In state s, pop E. 
((a, s, λ), (t, λ))   In state s, read a and go to state t. 
((a, s, λ), (s, λ))   In state s, read a. 
((λ, s, λ), (t, λ))   Move from state s to state t. 


Definition 3.9 M is a deterministic PDA if Y ⊆ ((Σ^λ × Q × Γ^λ) × (Q × 
Γ^λ)) has the property that if ((a, s, c), (s′, c′)) and ((a, s, c), (s″, c″)) ∈ Y then 
s′ = s″ and c′ = c″. 


Note that this definition differs between texts. 

Since the requirement that M is a deterministic PDA restricts the languages 
that it accepts, we will not consider the deterministic PDA. 

Although it seems a severe restriction, any language accepted by a PDA can 
be accepted by a PDA with only two states, which we will call s and t. The 
automaton leaves the first state before it reads the first letter and while the stack 
is still empty. The second state is then the terminal state. Often it is simpler or 
more convenient to use more states. An example of this will be shown in the 
examples. 

As with the regular automaton, we will show the PDA graphically. The PDA 
will be shown as a flow chart using only the instructions start, read, push, pop, 
and accept. It will be obvious that the flow charts below describe PDAs. We 
shall not try to prove that every PDA has a flow chart. Each edge of a flow chart 
has a state associated with it. For example in the following figure, (t) on the 
edge indicates that at that point on the chart, we are in state t. We could put 
the state with each edge, but we only do so when the state is changed. Thus the 
state is determined by the location on the flow chart. When only two states are 
used, including the start command which takes the PDA to the state t, the state 
will not be indicated. The symbol 

[Symbol: start] 

indicates start and switch to state t. The symbol 

[Symbol: start, push S] 

indicates start, push S, and switch to state t. The symbol 

[Symbol: read a] 

indicates read a. The symbol 

[Symbol: pop a] 

indicates pop a. The symbol 

[Symbol: push a] 

indicates push a. Finally, the symbol 

( accept ) 

indicates accept if the word has been read, the machine is in an acceptance state, 
and the stack is empty. Thus the diagram 

[Figure: flow chart of the PDA, not recoverable from the source] 

allows the PDA to read a and then push it or read b and pop a if it is on the stack. 
Thus every time a b is read, it removes an a which has been read and placed in 
the stack. In this example the alphabet and the stack symbols will both consist 
of a, and b. If, at any time, there were more bs read than as in the stack, there 
would be no a in the stack to remove and the PDA could not continue. If the 
number of as is equal to the number of bs when the word is read, then the stack 
will be empty. A word is accepted if, after popping a, the word has been read 
and the stack is empty. Therefore this PDA accepts words which have the same 
number of as and bs provided that, for every b in the word, the string preceding 
it contains more as than bs. For example consider the word aababb. We can 
trace its path with the following table: 











instruction   stack   tape 
start         λ       aababb 
read          λ       ababb 
push a        a       ababb 
read          a       babb 
push a        aa      babb 
read          aa      abb 
pop           a       abb 
read          a       bb 
push a        aa      bb 
read          aa      b 
pop           a       b 
read          a       λ 
pop           λ       λ 
accept        λ       λ 
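
The behavior traced in the table can be imitated with a short simulation. This is a minimal sketch under stated assumptions, since the flow chart itself did not survive in the source: it follows only the prose description (read a and push a, read b and pop a, accept when the word has been read and the stack is empty). The function name `run_pda` is hypothetical, and the treatment of the empty word is an assumption, so it is not exercised here.

```python
def run_pda(word):
    """Simulate the described PDA on a word over {a, b}."""
    stack = []
    for letter in word:
        if letter == "a":
            stack.append("a")  # read a, then push a
        elif letter == "b":
            if not stack:
                return False   # no a on the stack to pop: the PDA cannot continue
            stack.pop()        # read b, then pop a
        else:
            return False       # letter outside the alphabet
    return not stack           # accept only if the stack is empty at the end

print(run_pda("aababb"))
print(run_pda("abb"))
```

The word aababb from the table is accepted, while abb is rejected because the second b finds no a left on the stack.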








Example 3.26 The PDA 

[Figure: flow chart of the PDA, not recoverable from the source] 

accepts words containing the same number of as and bs. Consider the word 
abba. We can trace its path with the following table: 

instruction   stack   tape 
start         λ       abba 
read          λ       bba 
push a        a       bba 
read          a       ba 
pop           λ       ba 
read          λ       a 
push b        b       a 
read          b       λ 
pop           λ       λ 
accept        λ       λ 








In the following example, three states are used. A move to a new state is 
indicated in the diagram by an arrow for which there is no loop or are no return 
arrows. 


Example 3.27 The PDA 

[Figure: flow chart of the PDA with states s and t, not recoverable from the source] 

accepts words ww^R where w^R is the word w reversed. We read the first half of 
the word and then switch states to read the second half of the word. Consider 
the word abba. We can trace its path with the following table: 

state   instruction   stack   tape 
s       start         λ       abba 
s       read          λ       bba 
s       push a        a       bba 
s       read          a       ba 
s       push b        ba      ba 
t       read          ba      a 
t       pop           a       a 
t       read          a       λ 
t       pop           λ       λ 
t       accept        λ       λ 
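
A PDA for ww^R has to guess where the first half ends, which is why it is not deterministic. The sketch below imitates that guess by trying every possible switching point. It is an illustration under stated assumptions, not a transcription of the lost flow chart: in state s each letter read is pushed, in state t each letter read must match the symbol popped, and the function name `accepts_wwr` is hypothetical.

```python
def accepts_wwr(word):
    """Accept word if it has the form w followed by w reversed."""
    for mid in range(len(word) + 1):  # each guess of where to switch from s to t
        stack = list(word[:mid])      # state s: read each letter and push it
        matched = True
        for letter in word[mid:]:     # state t: read a letter and pop a match
            if not stack or stack.pop() != letter:
                matched = False
                break
        if matched and not stack:     # word read and stack empty: accept
            return True
    return False

print(accepts_wwr("abba"))
print(accepts_wwr("abab"))
```

For abba the guess that switches after ab succeeds, matching the trace in the table; for abab every guess fails.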
Exercises 


(1) Which of the following words are accepted by the following pushdown 
automaton M1? 

[Figure: flow chart of M1, not recoverable from the source] 





























(a) abbb 
(b) aabbb 
(c) aabbbbb 
(d) aaabbb 
(e) aabab 
(f) aaabbbb. 
(2) Use a table to trace each of the above words through the pushdown 
automaton M1. 
(3) What is the language accepted by the pushdown automaton M1? 
(4) Which of the following words are accepted by the following pushdown 
automaton M2? 

[Figure: flow chart of M2, not recoverable from the source] 

(a) abb 
(b) aabbaaa 
(c) aabbbaa 
(d) aaabaaa 
(e) aabba 
(f) aabb. 
(5) Use a table to trace each of the above words through the pushdown automa- 
ton M2. 
(6) What is the language accepted by the pushdown automaton M2? 
(7) Which of the following words are accepted by the following pushdown 
automaton M3? 

















(a) abb 
(b) aabbaaa 
(c) aabbbaa 
(d) aaabaaa 
(e) aabba 
(f) aabb. 

(8) Use a table to trace each of the above words through the pushdown automa- 
ton M3. 

(9) What is the language accepted by the pushdown automaton M3? 

(10) Which of the following words are accepted by the following pushdown 

automaton M4? 
[Figure: flow chart of M4, not recoverable from the source] 








(a) abb 
(b) bb 
(c) aabbbaaa 
(d) abbbaa 
(e) aabba 
(f) aabb. 
(11) Use a table to trace each of the above words through the pushdown 
automaton M4. 
(12) What is the language accepted by the pushdown automaton M4? 
(13) Given a pushdown automaton M = (Σ, Q, s0, I, Υ, F) where Σ = I =
{a, b}, Q = {s0, s1, s2}, F = {s2}, and Υ has the following relations:

((a, s0, λ), (s1, a))   In state s0, a is read, go to state s1 and push a
((b, s0, λ), (s1, b))
((a, s1, λ), (s1, a))
((b, s1, λ), (s1, b))
((a, s1, λ), (s2, λ))
((a, s2, a), (s2, λ))
((b, s2, b), (s2, λ))


(a) Complete the statements in the table. 
(b) Construct the flow chart for the PDA. 

(14) Given a pushdown automaton M = (Σ, Q, s0, I, Υ, F) where Σ = I =
{a, b}, Q = {s0, s1, s2}, F = {s2}, and Υ has the following relations:



































((a, s0, λ), (s1, a))   In state s0, a is read, go to state s1 and push a
((b, s0, λ), (s1, b))
((a, s1, a), (s1, b))
((a, s1, b), (s1, b))
(G, sı, a), (s2, a)) 
((b, s1, a), (Sj, à)) 
((a, S2, a), (s2, a)) 
((b, s2, a), (Sj, à)) 
(a) Complete the statements in the table.
(b) Construct the flow chart for the PDA. 

(15) Let Σ = {a, b, c}. Construct a pushdown automaton that reads the
language L = {wcw^R : w ∈ {a, b}*}.

(16) Let Σ = {a, b, c}. Construct a pushdown automaton that reads the
language L = {a^n c b^n : n is a nonnegative integer}.

(17) Let Σ = {a, b, c}. Construct a pushdown automaton that reads the
language L = {ww^R : w ∈ {a, b}*}.

(18) Let Σ = {a, b, c}. Construct a pushdown automaton that reads the
language L = {wcw^R : w ∈ {a, b}*}.

(19) Let Σ = {a, b, c}. Construct a pushdown automaton that reads the
language L = {w : The number of as in w is equal to the sum of the number
of bs and cs}.

(20) Let Σ = {a, b}. Construct a pushdown automaton that reads the language
L = {w : The number of as in w is equal to twice the number of bs or the
number of bs in w is equal to three times the number of as}.

(21) Given two pushdown automata

Γ = (N, Σ, S, P)
and
Γ' = (N', Σ, S', P')

over the same alphabet Σ and accepting languages L and L' respectively,
(a) Describe how to construct a pushdown automaton Γ1 that accepts the
language L ∪ L'.
(b) Construct a pushdown automaton Γ1 that accepts the language L ∪ L'
where L is the language accepted by the automaton in Example 3.26
and L' is the language accepted by the automaton in Example 3.27.
(22) Given two pushdown automata

Γ = (N, Σ, S, P)
and
Γ' = (N', Σ, S', P')

over the same alphabet Σ and accepting languages L and L' respectively,

(a) Describe how to construct a pushdown automaton Γ2 that accepts the
language LL'.

(b) Construct a pushdown automaton Γ2 that accepts the language LL'
where L is the language accepted by the automaton in Example 3.26
and L' is the language accepted by the automaton in Example 3.27.
(23) Given a pushdown automaton Γ = (N, Σ, S, P) over the alphabet Σ and
accepting language L,
(a) Describe how to construct a pushdown automaton Γ3 which accepts
the language L*.
(b) Construct a pushdown automaton Γ3 that accepts the language L ∪ L'
where L is the language accepted by the automaton in Example 3.26
and L' is the language accepted by the automaton in Example 3.27.
(24) Given two pushdown automata

Γ = (N, Σ, S, P)
and
Γ' = (N', Σ, S', P')

over the same alphabet Σ and accepting languages L and L' respectively,

Construct a pushdown automaton Γ4 that accepts the language L ∪ L'
where L is the language accepted by the automaton in Example 3.26 and
L' is the language accepted by the automaton in Example 3.27.


3.7 Mealy and Moore machines 


Previously, we defined a deterministic automaton, a device which only accepts
or recognizes words of a language of Σ*. We now produce two machines which
are similar to deterministic automata, but produce output.

The first machine we introduce is called a Moore machine, created by E. F.
Moore [30], and is denoted by (Σ, Δ, S, s0, Υ, φ). It also has a finite set of
states S including a starting state s0. It contains two alphabets Σ and Δ. The
first is the alphabet of input characters to be read by the machine. The second
is the alphabet of output characters produced by the machine. The Moore machine
retains the transition function Υ : S × Σ → S of the finite state automaton.
It also contains an output function φ : S → Δ. In the operation of a Moore
machine, the output is first produced using the output function φ before the
transition function Υ is used to read the input and change states. Imitating the
deterministic automaton, the Moore machine reads each element of a string w of
characters of Σ until it has read the entire string. During this process, it produces
output consisting of a string of characters of Δ. Since the Moore machine
produces output φ(s0) before the first input character is read and produces
output from the last state reached before the transition function tries and fails to
read input, the output string contains one more character than the input string.
Also since φ(s0) is always executed first, each output string must begin with
φ(s0). As with the deterministic automata, we say a Moore machine reads a
symbol a of the alphabet Σ to indicate that the letter a is used as input for the
function Υ. Similarly, in state s_i, if the output is φ(s_i), we shall say that the
machine prints the value φ(s_i), although the output may be used for an entirely
different purpose. Thus one may envision a Moore machine reading a string in
Σ* from a tape and printing a string in Δ* on the tape or on another tape.

As with the finite state automaton, we shall illustrate the Moore machine
using a finite state diagram. As in the deterministic automaton, if Υ(s_i, a) = s_j,
this is represented by

[Figure: edge labeled a from the vertex s_i to the vertex s_j]

If φ(s_i) = z, this is represented by

[Figure: vertex containing both s_i and z]

so that both s_i and φ(s_i) are represented inside the vertices of the diagram.


In the diagram

[Figure: state diagram of the Moore machine]

Σ = {a, b}, Δ = {0, 1}, S = {s0, s1, s2}, Υ is given by the table











Υ    s0   s1   s2
a    s0   s0   s2
b    s1   s1   s2








and φ is given by the table

s     φ(s)
s0    1
s1    0
s2    0

Given the input string aba, the machine first prints the value φ(s0) = 1. It
then reads a and remains in state Υ(s0, a) = s0. It then prints φ(s0) = 1. Next
it reads b and travels to state Υ(s0, b) = s1. It then prints φ(s1) = 0. Next it
reads a and travels to state Υ(s1, a) = s0. It then prints φ(s0) = 1. Since there
is no more input, operations cease. The result is the output string 1101. The
input string aabab produces the output string 111010. The input string baab
produces the output string 10110.
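
The operation just described can be sketched in a few lines of code. This is an illustrative sketch only: the function name run_moore and the dictionary layout are assumptions, and only the table entries used by the three worked inputs are listed.

```python
def run_moore(transitions, outputs, start, word):
    """Print the output of the start state, then one output per letter read."""
    state = start
    printed = [outputs[state]]
    for letter in word:
        state = transitions[(state, letter)]
        printed.append(outputs[state])
    return "".join(printed)


# Entries of the transition function used by the worked inputs in the text.
TRANSITIONS = {
    ("s0", "a"): "s0",
    ("s0", "b"): "s1",
    ("s1", "a"): "s0",
}
# The output function as given in the table.
OUTPUTS = {"s0": "1", "s1": "0", "s2": "0"}

print(run_moore(TRANSITIONS, OUTPUTS, "s0", "aba"))
```

The output string always has one more character than the input, because the output of the starting state is printed before any letter is read.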

Note that the Moore machine we have produced is actually the finite
automaton

[Figure: state diagram of the finite automaton]

except that we have added φ with the property that φ(s_i) = 0 if s_i is not an
acceptance state and φ(s_i) = 1 if s_i is an acceptance state. When we do this,
the last character printed will be 1 if and only if the input is accepted by the
finite automaton. Thus since the outputs for aba and ababa are 1101 and
110101 respectively, aba and ababa are accepted by the automaton. Using
this procedure we can “duplicate” any finite automaton with a Moore machine
where a word is accepted only if the last character output is 1. It may also be
observed that whenever a 1 appears in the output, the initial string of input which
has been read at that point is accepted by the finite automaton since the state
at that point is an acceptance state. For example, in the above example input
aabaabbab produces output 0001001001, so aab, aabaab, and aabaabbab
are all accepted by the automaton. Since φ(s0) = 1 the empty word is also
accepted. In general, the number of 1s in the output of a Moore machine which
“duplicates” a finite automaton is the number of initial strings of the input which
are accepted by the finite automaton.


Example 3.28 The automaton

[Figure: state diagram of the automaton]

has corresponding Moore machine

[Figure: state diagram of the corresponding Moore machine]

Input babbab produces output 0010010 so the initial substrings ba and babba are
accepted by the automaton. Since the input ababbbaa produces output
000100000, the only initial substring accepted by the automaton is aba since only
one 1 occurs.


Example 3.29 A unit delay machine delays the appearance of a bit in a string 
by one bit. Hence the appearance of a character in the output is preceded by 
one character in the input. The following machine is a unit delay machine. 


[Figure: state diagram of the unit delay machine]


So far, we have primarily shown that a Moore machine may be used to 
“duplicate” a finite automaton. This is only one of the uses of a Moore machine. 
However, any task performed by a Moore machine can be performed by another 
machine called a Mealy machine and conversely. In most cases the task is more 
easily shown using a Mealy machine. 

The Mealy machine also contains an output function; however, its input is
an edge rather than a state. Since the edge depends on the state and the input,
the output function δ “reads” a letter a ∈ Σ and the current state and prints
out a character of the output alphabet. Hence δ is a function from S × Σ to Δ.
More formally a Mealy machine is a sextuple Me = (Σ, Δ, S, s0, Υ, δ) where
Σ, Δ, S, s0, and Υ are the same as in the Moore machine and δ : S × Σ → Δ.
The Mealy machine is also best illustrated using a finite state diagram. Since
δ depends on both the state and the letter read, we shall denote the output by
placing it on the edge so that

[Figure: edge labeled a/z from the vertex s_i to the vertex s_j]

corresponds to Υ(s_i, a) = s_j and δ(s_i, a) = z. Note that, unlike in the Moore
machine, the output occurs after the input is read. Hence for every letter of
input, there is a character of output.

Consider the Mealy machine 
































Υ | s0   s1   s2
a | s1   s2   s2
b | s2   s0   s1

and

δ | s0   s1   s2
a | 1    1    0
b |      0    0











Given the input string aaabb, a is read, 1 is printed, and the machine moves to
state s1. The second a is read, 1 is printed, and the machine moves to state s2.
The third a is read, 0 is printed, and the machine remains at state s2. The letter
b is read, 0 is printed, and the machine moves to state s1. Finally, b is read, 0
is printed, and the machine reaches state s0. Thus input aaabb produces output
11000.


Example 3.30 The Mealy machine 


simply converts every a in the string to x, every b to y, and every c to z. Thus 
aabbcca is converted to xxyyzzx. 

Example 3.31 The 1s complement of a binary string converts each 1 in the
string to a 0 and each 0 to a 1. It is given by the state diagram


Example 3.32 If 1 is added to the 1s complement of a binary string of length
n, we obtain the 2s complement used to express the negative of an integer
if we discard any number carried over beyond n digits. Thus 1111 + 1 =
0000.

The following Mealy machine adds 1 to a binary string in this fashion. The
input string must be read in backwards and the output is printed out backwards
so the unit digit is read first. The state diagram

[Figure: state diagram with edge labels including 0/0, 1/0, and 1/1]

describes the Mealy machine. In this diagram, s1 is the state reached if there is
no 1 to carry when adding the digits. The state s2 is reached if there is a 1 to
carry when adding the digits. Let 1101 be the number in reverse. (Hence the
actual number is 1011.) First input 1 is read. The output is 0 and the machine
is in state s2. (This corresponds to 1 + 1 = 10 so 0 is output and 1 is carried.)
Now input 1 is read. The output is 0 and the machine remains in state s2.
(This corresponds to 1 + 1 = 10 so 0 is output and 1 is carried.) Next 0 is
input. The output is 1 and the machine moves to state s1. (This corresponds to
1 + 0 = 1 so 1 is output and nothing is carried.) Finally 1 is input. The output
is 1 and the machine remains in state s1. (This corresponds to 1 + 0 = 1 so
1 is output and nothing is carried.) Thus the output is 0011 and the number
is 1100.
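
The machine of this example can be sketched as a lookup table. The sketch rests on assumptions: the starting state is named s0 here, its transition on input 0 is inferred from the arithmetic (1 + 0 = 1 with nothing carried), and the names ADD_ONE, run_mealy, and add_one are hypothetical.

```python
# (state, input digit) -> (next state, output digit)
ADD_ONE = {
    ("s0", "0"): ("s1", "1"),  # 0 + 1 = 1, nothing carried (inferred)
    ("s0", "1"): ("s2", "0"),  # 1 + 1 = 10, carry 1
    ("s1", "0"): ("s1", "0"),  # no carry: digits are copied
    ("s1", "1"): ("s1", "1"),
    ("s2", "0"): ("s1", "1"),  # 0 + carry = 1, nothing carried
    ("s2", "1"): ("s2", "0"),  # 1 + carry = 10, carry 1
}


def run_mealy(table, start, word):
    """Print one output character for every input character read."""
    state = start
    printed = []
    for letter in word:
        state, out = table[(state, letter)]
        printed.append(out)
    return "".join(printed)


def add_one(bits):
    """Add 1 to a binary string, discarding any final carry."""
    reversed_output = run_mealy(ADD_ONE, "s0", bits[::-1])
    return reversed_output[::-1]
```

For instance, add_one("1011") feeds the reversed string 1101 to the machine, which prints 0011, so the result is 1100 as in the text.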


Example 3.33 The Mealy machine M+ adds two signed integers. The signed
integer m is subtracted from the signed integer n by adding n to the 2s
complement of m. Thus M+ can also be used for subtraction by first using the
machine in the previous example to find the 2s complement of the number to
be subtracted. Assume a_n a_{n-1} ... a_2 a_1 and b_n b_{n-1} ... b_2 b_1 are the two
strings to be added. We again assume that the two strings to be added are read
in reverse so the first two digits to be input are a_1 and b_1, followed by a_2 and
b_2, ..., followed by a_n and b_n. We shall consider the pair of digits to be input
as ordered pairs, so that (a_1, b_1) is the first element of input. The machine M+
is

[Figure: state diagram of M+ with states s0 and s1; edge labels include (0,0)/0, (0,1)/0, (1,0)/0, (1,0)/1, and (1,1)/1]

The machine is in state s0 when no 1 has been carried in adding the previous
input and is in state s1 when a 1 has been carried in the addition. Assume that
0101 and 1101 are added. First (1, 1) is read, so the machine moves to s1 and
prints 0. Next (0, 0) is read, so the machine moves to s0 and prints 1. Then (1, 1)
is read, so the machine returns to s1 and prints 0. Finally (0, 1) is read, so the
machine remains at s1 and prints 0. Note that the 1, if it exists, which is carried
from adding the last two digits is discarded. Thus the sum of 0101 and 1101 is
0010.
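
The adder can be sketched by generating its table from the carry rule rather than copying it from the diagram. This is an assumption-laden sketch: the names build_adder and add_binary are hypothetical, and both strings are assumed to have the same length.

```python
def build_adder():
    """Build the table (state, (a, b)) -> (next state, output digit)."""
    table = {}
    for carry, state in ((0, "s0"), (1, "s1")):
        for a in (0, 1):
            for b in (0, 1):
                total = a + b + carry
                next_state = "s1" if total >= 2 else "s0"
                table[(state, (a, b))] = (next_state, str(total % 2))
    return table


def add_binary(x, y):
    """Add two binary strings of equal length, discarding the last carry."""
    if len(x) != len(y):
        raise ValueError("both strings must have the same length")
    table = build_adder()
    state = "s0"
    digits = []
    # The unit digits are read first, so both strings are read in reverse.
    for a, b in zip(reversed(x), reversed(y)):
        state, out = table[(state, (int(a), int(b)))]
        digits.append(out)
    # The output was produced in reverse, so it is reversed again.
    return "".join(reversed(digits))
```

With the inputs of the example, add_binary("0101", "1101") reads the pairs (1, 1), (0, 0), (1, 1), (0, 1) and returns 0010.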


Earlier in this section, we implied that Moore machines and Mealy machines 
were equivalent in the sense that every Moore machine could be duplicated by 
a Mealy machine and conversely. More specifically, given a Moore machine, 
there is a Mealy machine which will produce output equivalent to the Moore 
machine when given the same input. Conversely given a Mealy machine, there 
is a Moore machine which will produce the output equivalent to the Mealy 
machine when given the same input. 

We first need to specify what we mean by equivalent output since a Mealy
machine always has one less symbol of output than the Moore machine. A
string of output of a Mealy machine is equivalent to a string of output of a
Moore machine if it is equal to the substring of the Moore machine excluding
the first symbol φ(s0). Thus if the Moore machine produced output 010010101,
the equivalent output from the Mealy machine would be 10010101.

The transformation from the Moore machine to an equivalent Mealy machine
is the simplest. With the transition

[Figure: Moore transition on input c from the state s0, with output a0, to the state s1, with output a1]

in a Moore machine, given input c, the character a0 will be printed, the machine
will move to state s1, and a1 will next be printed. In the transition

[Figure: Mealy transition from s0 to s1 with edge label c/a1]

of a Mealy machine, the machine will move to state s1 with input c and a1 will
be printed. Since we disregard a0 in the string produced by the Moore machine
in our definition of equivalent output, we have begun with the same output.
Assume that we have the transition

[Figure: Moore transition on input b from the state s_i, with output a_i, to the state s_j, with output a_j]

in a Moore machine and a_i has already been printed. Input b moves the machine
to state s_j, and the next output will be a_j. The corresponding transition in the
Mealy machine is

[Figure: Mealy transition from s_i to s_j with edge label b/a_j]

which produces the same transition and output.
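
The transformation amounts to labeling every edge with the output of the state it enters. A minimal sketch follows, assuming machines stored as dictionaries; the function names and the small example machine (taken from the entries of the earlier Moore machine that the text exercises) are illustrative choices.

```python
def moore_to_mealy(transitions, outputs):
    """Label each edge (s, a) -> t with the Moore output of the target t."""
    return {(s, a): (t, outputs[t]) for (s, a), t in transitions.items()}


def run_moore(transitions, outputs, start, word):
    state = start
    printed = [outputs[state]]
    for letter in word:
        state = transitions[(state, letter)]
        printed.append(outputs[state])
    return "".join(printed)


def run_mealy(table, start, word):
    state = start
    printed = []
    for letter in word:
        state, out = table[(state, letter)]
        printed.append(out)
    return "".join(printed)


MOORE_T = {("s0", "a"): "s0", ("s0", "b"): "s1", ("s1", "a"): "s0"}
MOORE_OUT = {"s0": "1", "s1": "0"}
MEALY = moore_to_mealy(MOORE_T, MOORE_OUT)

# The Mealy output equals the Moore output without its first symbol.
print(run_moore(MOORE_T, MOORE_OUT, "s0", "aba"), run_mealy(MEALY, "s0", "aba"))
```

No new states are needed in this direction, which is why the text calls it the simpler of the two transformations.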


Example 3.34 The Mealy machine corresponding to the Moore machine

[Figure: Moore machine with edges labeled b, a, b, and a]

is

[Figure: Mealy machine with edges labeled b/1, a/0, b/0, and a/1]

In transforming a Mealy machine to a Moore machine, we have to consider
the problem where arrows into a given state produce different output. Consider
the following example:

[Figure: Mealy machine in which arrows enter the state s with different outputs]

In a Moore machine, the state s produces unique output so it cannot produce
both x and y as output. We solve this by making two copies of s. One will
produce x as output and the other y as output as follows.

[Figure: Moore machine with two copies of the state s]

Obviously both machines produce output x with input a and output y with
input b. For simplicity, we shall write s*/x as s/x and s'/y as s/y, noting
that they are different states.

In general, for each state s, except the starting state, in a Mealy machine and
for each output symbol z, we shall produce a copy s/z of the state s. This may
result in some overkill since in the above example, if the output symbols were x,
y, and z, we would not have needed state s/z since there was no arrow entering
s with output z. We begin with initial state s0 and give it an arbitrary output
variable x0 from the set of output variables since it is not used in producing
output equivalent to the Mealy machine. If we have

[Figure: Mealy transitions leaving the initial state, with edge labels including a/x and c/z]

in the Mealy machine, we replace it with

[Figure: corresponding transitions of the Moore machine]

in the Moore machine. For other states, we replace

[Figure: Mealy transitions with edge labels a/x and b/y]

with

[Figure: corresponding transitions of the Moore machine]

We produce the same output at each step for both machines.
Thus the machine equivalent to

[Figure: Mealy machine with states s0 and s1 and edge labels including a/0, a/1, and b/0]

is

[Figure: equivalent Moore machine]
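
The copying construction can be sketched as follows. This sketch is deliberately cruder than the text: it makes a copy (s, z) of every state, including the starting state, for every output symbol, which is the "overkill" mentioned above. The function names, the arbitrary start output x0, and the small example machine (a 1s complement machine with a single state) are assumptions chosen for illustration.

```python
def mealy_to_moore(mealy, start, x0):
    """Build a Moore machine whose states are pairs (state, last output)."""
    symbols = {out for (_, out) in mealy.values()} | {x0}
    states = {s for (s, _) in mealy} | {t for (t, _) in mealy.values()}
    # Reading a in any copy of p enters the copy of q that prints delta(p, a).
    transitions = {
        ((p, z), a): (q, out)
        for (p, a), (q, out) in mealy.items()
        for z in symbols
    }
    outputs = {(s, z): z for s in states for z in symbols}
    return transitions, outputs, (start, x0)


def run_mealy(table, start, word):
    state, printed = start, []
    for letter in word:
        state, out = table[(state, letter)]
        printed.append(out)
    return "".join(printed)


def run_moore(transitions, outputs, start, word):
    state = start
    printed = [outputs[state]]
    for letter in word:
        state = transitions[(state, letter)]
        printed.append(outputs[state])
    return "".join(printed)


# A one-state Mealy machine that complements each bit.
COMPLEMENT = {("s0", "0"): ("s0", "1"), ("s0", "1"): ("s0", "0")}
T, OUT, START = mealy_to_moore(COMPLEMENT, "s0", "0")
```

After dropping the first printed symbol x0, the Moore machine built here prints the same string as the Mealy machine, which is the notion of equivalent output used in this section.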





Exercises 


(1) Let the Moore machine Mo = (Σ, Δ, S, s0, Υ, φ) be given by the
diagram

[Figure: state diagram of the Moore machine]

Describe Δ, Σ, and S. Find tables for Υ and φ.
(2) Let the Moore machine Mo = (Σ, Δ, S, s0, Υ, φ) be given by the
diagram

[Figure: state diagram of the Moore machine]

Describe Δ, Σ, and S. Find tables for Υ and φ.
(3) Let the Moore machine Mo = (Σ, Δ, S, s0, Υ, φ) be given by the
diagram

[Figure: state diagram of the Moore machine]

(a) Find the output with input bbabab.
(b) Find the output with input aaabbaba.
(c) Find the output with input bbbaaa.
(d) Find the output with input λ, the empty word.
(4) Let the Moore machine Mo = (Σ, Δ, S, s0, Υ, φ) be given by the
diagram

[Figure: state diagram of the Moore machine]

(a) Find the output with input abcabca.

(b) Find the output with input bbbaaacc.

(c) Find the output with input aabbccaa.

(d) Find the output with input λ, the empty word.
(5) Find the Moore machine that duplicates the finite automaton

[Figure: state diagram of the finite automaton]

(6) Find the Moore machine that duplicates the finite automaton

[Figure: state diagram of the finite automaton]

(7) Find the Moore machine that duplicates the finite automaton

(9) Let the Mealy machine Me = (Σ, Δ, S, s0, Υ, δ) be given by the
diagram

[Figure: state diagram of the Mealy machine, with edge labels including b/0, b/1, and a/1]

Describe Δ, Σ, and S. Find tables for Υ and δ.
(10) Let the Mealy machine Me = (Σ, Δ, S, s0, Υ, δ) be given by the
diagram

[Figure: state diagram of the Mealy machine]

Describe Δ, Σ, and S. Find tables for Υ and δ.
(11) Let the Mealy machine Me = (Σ, Δ, S, s0, Υ, δ) be given by the
diagram

[Figure: state diagram of the Mealy machine]

(a) Find the output with input abaabbab.

(b) Find the output with input bbaaba.

(c) Find the output with input aabbaaa.

(d) Find the output with input λ, the empty word.
(12) Let the Mealy machine Me = (Σ, Δ, S, s0, Υ, δ) be given by the diagram

[Figure: state diagram of the Mealy machine]

(a) Find the output with input abcccbab.
(b) Find the output with input bbaabc.
(c) Find the output with input aaccbba.
(13) Given the Moore machine Mo = (Σ, Δ, S, s0, Υ, φ)

[Figure: state diagram of the Moore machine]

find the equivalent Mealy machine.
(14) Given the Moore machine Mo = (Σ, Δ, S, s0, Υ, φ)

[Figure: state diagram of the Moore machine]

find the equivalent Mealy machine.
(15) Given the Mealy machine Me = (Σ, Δ, S, s0, Υ, δ)

[Figure: state diagram of the Mealy machine]

find the equivalent Moore machine.
(16) Given the Mealy machine Me = (Σ, Δ, S, s0, Υ, δ)

[Figure: state diagram of the Mealy machine]

find the equivalent Moore machine.


(17) Construct a Mealy machine which directly subtracts a signed binary
number from another signed binary number.

(18) Let Z5 = {0, 1, 2, 3, 4} be the set of integers modulo 5, where the “sum” of
two integers is found by adding the numbers and finding the remainder of
this sum when divided by 5. Therefore 3 + 4 = 2 and 2 + 3 = 0. Construct
the Moore machine that gives a sum of initial strings of elements of Z5.
Thus the input 2140321 produces output 02322023.


4 


Grammars 


4.1 Formal grammars 


A grammar is intuitively a set of rules which are used to construct a language
contained in Σ* for some alphabet Σ. These rules allow us to replace symbols
or strings of symbols with other symbols or strings of symbols until we finally
have strings of symbols contained in Σ, allowing us to form an element of
the language. By placing restrictions on the rules, we shall see that we can
develop different types of languages. In particular we can restrict our rules to
produce desirable qualities in our language. For example, in our examples below
we would not want 3 + +4 − ×6. We also would not want a sentence Slowly
cowboy the leaped sunset. Suppose that we begin with a word add, and that we
have a rule that allows us to replace add with A + B and that both A and B can
be replaced with any nonnegative integer less than ten. Using this rule, we can
replace A with 5 and B with 3 to get 5 + 3. There might also be an additional
rule that allows us to replace add with a different string of symbols.

If we add further rules that A can be replaced by A + B and B can be
replaced by A × B, we can start by replacing add with A + B. If we then
replace A with A + B and B with A × B, we get A + B + A × B. We can
continue this process getting longer and longer strings, so that we can continue
to build strings of arbitrary length, but eventually we will want to replace all of
the As and Bs with integers. As noted above, we have choices in the replacement
of A and B so there is not necessarily any uniqueness in replacing a symbol or
string of symbols. Hence grammars are not deterministic. If we have derived
A + A × B and choose to replace the As and Bs with integers we do not have
to replace both As with the same value. If we replace the first A
with 3, the second A with 5, and B with 7, we have 3 + 5 × 7.

Note that in the above rules add, A, and B can be replaced by other symbols
while + and the integers cannot be replaced. The symbols that can be replaced
by other symbols are called nonterminal symbols and the symbols that cannot
be replaced by other symbols are called terminal symbols. We generate
an element of the language when the string consists only of terminal symbols.
The rules which tell us how to replace symbols are called productions. We
denote the production (or rule) which tells us that add can be replaced with
A + B by

add → A + B.

Thus the productions for our first example above are

add → A + B
A → A + B
B → A × B
A → 0     B → 0
A → 1     B → 1
A → 2     B → 2
  ...       ...
A → 9     B → 9.

Below, we shall expand our rules to do arbitrary addition, subtraction,
multiplication, and division of integers.
A grammar is formally defined as follows: 


Definition 4.1 A formal grammar or phrase structure grammar Γ is denoted
by the 4-tuple (N, Σ, S, P) which consists of a finite set of nonterminal symbols
N, a finite set of terminal symbols Σ, an element S ∈ N, called the start symbol,
and a finite set of productions P, which is a relation in (N ∪ Σ)* such that each
first element in an ordered pair of P contains a symbol from N and at least one
production has S as the left string in some ordered pair.


Definition 4.2 If W and W' are elements of (N ∪ Σ)*, W = uvw, W' =
uv'w, and v → v' is a production, this is denoted by W ⇒ W'. If

W_1 ⇒ W_2 ⇒ W_3 ⇒ ··· ⇒ W_n

for n > 1, then W_n is derived from W_1. This is denoted by W_1 ⇒* W_n and is
called a derivation. If the number of productions is not important we simply
use W_1 ⇒* W_n. The set of all strings of elements of Σ which may be generated
by the set of productions P is called the language generated by the grammar
Γ and is denoted by Γ(L).


To generate a word from the grammar Γ, we keep using productions to derive
new strings until we have a string consisting only of terminal elements.
Thus in our example above,
N = {add, A, B},
Σ = {+, ×, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
S = add,
and
P = {(add, A + B), (A, A + B), (B, A × B), (A, 0),
(A, 1), ..., (A, 9), (B, 0), (B, 1), ..., (B, 9)},


where we will denote (add, A + B) by add → A + B, (A, A + B) by A →
A + B, etc. If we eliminate the production (B, A × B), the language generated
by Γ is the set of all formal expressions of finite sums of nonnegative integers
less than 10.


Example 4.1 In the grammar described above, derive the expression

2 + 4 + 7 × 6.
Begin with the production
add → A + B
to derive
A + B.
Then use the production
B → A × B
to derive
A + A × B.
Then use the production
A → A + B
to derive
A + A + B × B.


Then use the productions

A → 2   A → 4   B → 7   B → 6

to derive 2 + 4 + 7 × 6. Note that we cannot derive
3 × 2 + 4 + 7 × 6.
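
The derivation in this example can be replayed mechanically. The sketch below is only an illustration under stated assumptions: a string is stored as a list of symbols, the helper name apply_production is hypothetical, and the occurrence argument (counted from 0) selects which copy of the nonterminal is replaced, since the text leaves that choice to the reader.

```python
# The productions of the example, as pairs (left side, right side).
PRODUCTIONS = {
    ("add", ("A", "+", "B")),
    ("A", ("A", "+", "B")),
    ("B", ("A", "×", "B")),
}
PRODUCTIONS |= {(n, (str(d),)) for n in ("A", "B") for d in range(10)}


def apply_production(symbols, lhs, rhs, occurrence=0):
    """Replace the chosen occurrence of lhs by rhs, if lhs -> rhs is a production."""
    if (lhs, tuple(rhs)) not in PRODUCTIONS:
        raise ValueError("not a production of the grammar")
    positions = [i for i, s in enumerate(symbols) if s == lhs]
    i = positions[occurrence]
    return symbols[:i] + list(rhs) + symbols[i + 1:]


def derive_example():
    w = ["add"]
    w = apply_production(w, "add", ["A", "+", "B"])           # A + B
    w = apply_production(w, "B", ["A", "×", "B"])             # A + A × B
    w = apply_production(w, "A", ["A", "+", "B"], occurrence=1)  # A + A + B × B
    w = apply_production(w, "A", ["2"])
    w = apply_production(w, "A", ["4"])
    w = apply_production(w, "B", ["7"])
    w = apply_production(w, "B", ["6"])
    return " ".join(w)
```

Each call corresponds to one step W ⇒ W' of Definition 4.2, so the whole function is one derivation of the expression from add.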


Example 4.2 Suppose we want a grammar which derives arithmetic expressions
for the set of integers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Thus the language
generated by the grammar is the set of all finite arithmetic expressions for the
set of integers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Examples would be 3 × (5 + 4) and
(4 + 5) + (3^2), where ^ denotes exponent. As mentioned above, we obviously
want to exclude expressions such as 3 + ×6 and 3 + +6 × 4 − 5. Let the set
N = {S, A, B} and Σ = {+, −, ×, ÷, ^, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, (, )}. We will
need the following productions:

S → (A + B)     B → (A + B)
S → (A − B)     B → (A − B)
S → (A × B)     B → (A × B)
S → (A ÷ B)     B → (A ÷ B)
S → (A ^ B)     B → (A ^ B)
A → (A + B)     A → 0
A → (A − B)       ...
A → (A × B)     A → 9
A → (A ÷ B)     B → 0
A → (A ^ B)       ...
                B → 9.

We will use the grammar to derive the arithmetic expression
((2 + 3) + (4 + 5)).
We begin with the production
S → (A + B).
We then use the productions
A → (A + B)
and
B → (A + B)
to derive

((A + B) + (A + B)).
The productions
A → 2 and B → 3
give us
((2 + 3) + (A + B)).
Finally we use the productions
A → 4 and B → 5
to derive
((2 + 3) + (4 + 5)).
We next use the grammar to derive the arithmetic expression
((3^2) + (5 × 7)).
We begin with the production
S → (A + B).
We then use the productions
A → (A ^ B) and B → (A × B)
to derive
((A ^ B) + (A × B)).
The productions
A → 3 and B → 2
give us
((3^2) + (A × B)).
Finally we use the productions
A → 5 and B → 7
to derive
((3^2) + (5 × 7)).

Example 4.3 In a similar manner, we may form arithmetic expressions in
postfix notation. Let the set N = {S, A, B} and

Σ = {+, −, ×, ÷, ^, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.
We will need the following productions:

S → AB+     A → AB+     B → AB+     A → 0
S → AB−     A → AB−     B → AB−       ...
S → AB×     A → AB×     B → AB×     A → 9
S → AB÷     A → AB÷     B → AB÷     B → 0
S → AB^     A → AB^     B → AB^       ...
                                    B → 9.

Consider the expression 3 2 + 4 7 + ×. Since our integers are all less
than ten, 3 2 + represents the integer symbol 3, followed by the integer
symbol 2 and the + symbol. To construct this expression we begin with the
production

S → AB×.
We then use the productions
A → AB+ and B → AB+
to derive
AB+AB+×.
The productions
A → 3 and B → 2
give us
3 2 + AB+×.
Finally we use the productions
A → 4 and B → 7
to derive
3 2 + 4 7 + ×.


Example 4.4 A grammar may also be used to derive proper sentences. These 
sentences are proper in the sense that they are grammatically correct, although 
they may not have any meaning. Suppose we want a grammar which will derive 
the following statements, among others: 


Joe chased the dog. 
The fast horse leaped over the old fence. 
The cowboy rode slowly into the sunset. 

Before actually stating the grammar let us decide upon its structure. This
allows us to be assured that each sentence in the grammar is a grammatically
correct sentence. Each of our sentences has a noun phrase (noun p), a verb
phrase (verb p), and another noun phrase. In addition the last two sentences
have a preposition (prep). Therefore let the first production be

S → <noun p><verb p><prep><noun p>.

In our example, the most general form of a noun phrase is an article followed
by an adjective and then a noun. Therefore let the next production be

<noun p> → <art><adj><noun>

where “art” represents article and “adj” represents adjective.
The most general form of a verb phrase is a verb followed by an adverb.
Therefore let the next production be

<verb p> → <adv><verb>

where “adv” represents adverb.

At this point, we know that the terminal set Σ = {Joe, chased, the, The,
dog, fast, horse, leaped, over, old, fence, cowboy, rode, slowly, into, sunset}.
The nonterminal set N = {S, <noun p>, <verb p>, <art>, <adj>, <noun>,
<adv>, <verb>, <prep>}.

We next need productions which will assign values to <art>, <adj>,
<noun>, <adv>, and <verb>. In some of our sentences we do not need
<art>, <adj>, <prep>, and <adv>. To solve this problem, we include
the productions

<art> → λ     <adj> → λ     <adv> → λ     <prep> → λ.

By assigning these symbols to the empty word, we simply erase them when
they are not needed. The remainder of our productions consists of the following:

<art> → the       <noun> → horse      <noun> → fence
<adj> → fast      <noun> → dog        <adv> → slowly
<adj> → old       <noun> → cowboy     <verb> → chased
<noun> → Joe      <noun> → sunset     <verb> → leaped
<verb> → rode     <prep> → over       <prep> → into
<art> → The.
To derive the sentence “Joe chased the dog,” we begin with

S → <noun p><verb p><prep><noun p>

to derive

<noun p><verb p><prep><noun p>.

Using the production

<noun p> → <art><adj><noun>

we derive

<art><adj><noun><verb p><prep><noun p>.
Using
<art> → λ     <adj> → λ
we derive
<noun><verb p><prep><noun p>.
Repeating the process for the second <noun p>, we derive
<noun><verb p><prep><art><noun>.
Using
<verb p> → <adv><verb>,

we derive

<noun><adv><verb><prep><art><noun>.

Using
<adv> → λ     <prep> → λ
we derive
<noun><verb><art><noun>.
Using

<noun> → Joe     <noun> → dog     <verb> → chased     <art> → the

we derive “Joe chased the dog.”
To derive the sentence “The fast horse leaped over the old fence,” we again
begin with

S → <noun p><verb p><prep><noun p>

to derive

<noun p><verb p><prep><noun p>.
Using the production
<noun p> → <art><adj><noun>
we derive
<art><adj><noun><verb p><prep><noun p>.
Using

<art> → The     <adj> → fast     <noun> → horse

we derive
The fast horse <verb p><prep><noun p>.
Using
<verb p> → <adv><verb>,
we derive

The fast horse <adv><verb><prep><noun p>.

Using
<adv> → λ     <verb> → leaped
we derive
The fast horse leaped <prep><noun p>.
Using
<prep> → over,
we derive

The fast horse leaped over <noun p>.
Using the production
<noun p> → <art><adj><noun>
we derive
The fast horse leaped over <art><adj><noun>.
Using

<art> → the     <adj> → old     <noun> → fence
we derive

The fast horse leaped over the old fence.
Derivation of the last sentence is left to the reader.


Definition 4.3 For each production P → w1 w2 w3 ... wn the corresponding
tree is

              P
         /  /  |    \
       w1  w2  w3 ... wn

Thus the corresponding tree for S → A + B is

        S
      / | \
     A  +  B

Definition 4.4 If the corresponding trees of the productions used to derive a
given expression are connected, they form a tree with root S, called the parse
tree or the derivation tree. If A → B occurs in the derivation then there is an
edge from A to B in the tree. The symbols A and B are called vertices or nodes.
The vertex B is called the child of A. Note that a terminal at a vertex has no
children. Such a vertex is called a leaf of the tree. The leaves of the tree, when
read left to right, form the word generated by the tree. If A_0 → A_1 → ··· → A_n
forms a string of edges in the tree then there is a path of length n from A_0 to A_n.
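
The notions in Definition 4.4 can be sketched with a small data structure. This is an illustrative sketch only: a tree is stored as a pair (symbol, list of children), the names leaves and longest_path are hypothetical, and the sample tree is one parse tree for 2 + 4 + 7 × 6 in the grammar with start symbol add.

```python
def leaves(tree):
    """Return the leaves of the tree, read left to right."""
    symbol, children = tree
    if not children:
        return [symbol]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


def longest_path(tree):
    """Return the length (number of edges) of the longest path from the root."""
    _, children = tree
    if not children:
        return 0
    return 1 + max(longest_path(child) for child in children)


def leaf(symbol):
    return (symbol, [])


# add -> A + B, A -> A + B, B -> A × B, then A -> 2, B -> 4, A -> 7, B -> 6.
PARSE_TREE = (
    "add",
    [
        ("A", [("A", [leaf("2")]), leaf("+"), ("B", [leaf("4")])]),
        leaf("+"),
        ("B", [("A", [leaf("7")]), leaf("×"), ("B", [leaf("6")])]),
    ],
)

print(" ".join(leaves(PARSE_TREE)))
```

Reading the leaves left to right recovers the generated word, and the path add → A → A → 2 has length 3.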


Example 4.5 In Example 4.1, we used productions to derive 3 + 2 + 4.
To construct the tree, begin with the first production used

$\text{add} \to A + B$

to form the corresponding tree

   add
  / | \
 A  +  B

Then use the corresponding tree

    A
  / | \
 A  +  B

of the production

$A \to A + B$

to form the tree

      add
     / | \
    A  +  B
  / | \
 A  +  B

Then use the corresponding tree

    B
  / | \
 A  ×  B

of the production

$B \to A \times B$

to get the corresponding tree

        add
      /  |  \
    A    +    B
  / | \     / | \
 A  +  B   A  ×  B

Then use the corresponding trees of the next productions

$A \to 2 \quad B \to 4 \quad A \to 7 \quad B \to 6$

to form the parse tree

        add
      /  |  \
    A    +    B
  / | \     / | \
 A  +  B   A  ×  B
 |     |   |     |
 2     4   7     6


Example 4.6 In Example 4.3, to derive ((2 + 3) × (4 + 5)), we use the
productions

$S \to (A \times B) \quad A \to (A + B) \quad A \to 2 \quad B \to 3$
$B \to (A + B) \quad A \to 4 \quad B \to 5.$

Therefore the parse tree is the tree

                    S
           /    /   |   \    \
    (        A      ×      B        )
         / / | \ \     / / | \ \
        ( A  +  B )   ( A  +  B )
          |     |       |     |
          2     3       4     5


Example 4.7 In Example 4.4, to derive the sentence “Joe chased the dog,” we
use the production

S → <noun p><verb p><prep><noun p>

to get

<noun p><verb p><prep><noun p>.

Using

<noun p> → <art><adj><noun>

we get

<art><adj><noun><verb p><prep><noun p>.

Using

<art> → λ    <adj> → λ

we get

<noun><verb p><prep><noun p>.

Again using

<noun p> → <art><adj><noun>

and

<adj> → λ

we get

<noun><verb p><prep><art><noun>.

Using

<verb p> → <adv><verb>,

we get

<noun><adv><verb><prep><art><noun>.

Using

<adv> → λ    <prep> → λ

we get

<noun><verb><art><noun>.

Using

<noun> → Joe    <noun> → dog    <verb> → chased    <art> → the

we have the correspondence tree for “Joe chased the dog.”

                               S
                   /        /     \        \
    (noun p)           (verb p)     (prep)      (noun p)
     /  |  \             /  \         |          /  |  \
(art) (adj) (noun)   (adv) (verb)     λ     (art) (adj) (noun)
  |     |     |        |     |                |     |     |
  λ     λ    Joe       λ   chased            the    λ    dog


Example 4.8 In Example 4.4, to derive the sentence “The fast horse leaped
over the tall fence,” we use productions

S → <noun p><verb p><prep><noun p>

<noun p> → <art><adj><noun>    (used twice)

<verb p> → <adv><verb>    <adv> → λ

<prep> → over    <art> → The    <adj> → fast

<adj> → tall    <noun> → fence

<noun> → horse    <verb> → leaped

<art> → the.

Thus the parse tree is

                               S
                   /        /     \        \
    (noun p)           (verb p)     (prep)      (noun p)
     /  |  \             /  \         |          /  |  \
(art) (adj) (noun)   (adv) (verb)    over   (art) (adj) (noun)
  |     |     |        |     |                |     |     |
 The  fast  horse      λ   leaped            the  tall  fence


In all of the grammars in this section, the productions have been of the form
$A \to W$, where $A$ is a nonterminal symbol. Therefore the production can be
used everywhere that $A$ appears, regardless of its position in an expression.
Such grammars are called context-free grammars. A language generated by
a context-free grammar is called a context-free language. If a grammar has
a production of the form $aAb \to W$ where $A$ is a nonterminal and
$ab \neq \lambda$, then this production can only be used when $a$ is on the
left-hand side of $A$ and $b$ is on the right-hand side. It therefore cannot be
used whenever $A$ appears and so it is dependent on the context in which $A$
appears. Such a grammar is called a context-sensitive grammar.

In the following examples, we consider context-free grammars which gen- 
erate more abstract languages: 


Example 4.9 Let $\Gamma = (N, \Sigma, S, P)$ be the grammar defined by
$N = \{S, A, B\}$, $\Sigma = \{a, b\}$ and $P$ be the set of productions

$S \to AB \quad A \to a \quad B \to Bb \quad B \to \lambda \quad A \to \lambda \quad A \to aA.$

Using the production $S \to AB$, we derive $AB$. Next using the productions
$A \to a$ and $B \to \lambda$, we derive $a$. If we use the productions

$S \to AB \quad A \to \lambda \quad B \to Bb \quad B \to \lambda$

in order, we derive $b$. We can also generate aabbb, aaaa, aaab, and bbbbb.
In fact, we can generate $a^m b^n$ for all nonnegative integers $m, n$. Hence the
expression for the language generated by $\Gamma$ is a*b*.
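
The claim in Example 4.9 can be checked by brute force for short words. The following is only a sketch under stated assumptions: nonterminals are written as upper-case letters, terminals as lower-case letters, the empty string stands for the empty word λ, and the function name `generate` is hypothetical. The search always expands the leftmost nonterminal and discards any sentential form that already carries more than `max_len` terminals; it terminates for this grammar, but not necessarily for every grammar.

```python
import re
from collections import deque

# Productions of Example 4.9; "" stands for the empty word λ.
PRODUCTIONS = {
    "S": ["AB"],
    "A": ["a", "", "aA"],
    "B": ["Bb", ""],
}

def generate(productions, start, max_len):
    """Return all terminal words of length <= max_len derivable from start."""
    words, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal (an upper-case letter), if any.
        idx = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if idx is None:
            words.add(form)
            continue
        for rhs in productions[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # Terminals never disappear, so overly long forms can be pruned.
            if sum(ch.islower() for ch in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

words = generate(PRODUCTIONS, "S", 4)
print(sorted(words, key=lambda w: (len(w), w)))
print(all(re.fullmatch(r"a*b*", w) for w in words))
```

Every word found matches a*b*, and every word of a*b* of length at most 4 is found, which agrees with the expression given in the example.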


Example 4.10 Let $\Gamma' = (N, \Sigma, S, P)$ be the grammar defined by
$N = \{S, A\}$, $\Sigma = \{a, b\}$ and $P$ be the set of productions

$S \to aAb \quad A \to aAb \quad A \to \lambda.$

Using the productions $S \to aAb$ and $A \to \lambda$ we derive $ab$. Using the
productions

$S \to aAb \quad A \to aAb \quad A \to \lambda$

in order, we derive aabb or $a^2 b^2$. Using the productions

$S \to aAb \quad A \to aAb \quad A \to aAb \quad A \to \lambda$

in order, we derive aaabbb or $a^3 b^3$. It is easily seen that the language
generated by $\Gamma'$ is $\{a^n b^n : n \text{ is a positive integer}\}$. Note
that this is not the same as a*b* since this would also include $a^m b^n$ where
$m$ and $n$ are not equal.


Example 4.11 Let $\Gamma'' = (N, \Sigma, S, P)$ be the grammar defined by
$N = \{S, A, B\}$, $\Sigma = \{a, b\}$ and $P$ be the set of productions

$S \to ABABABA \quad A \to Aa \quad A \to \lambda \quad B \to b.$

It can be shown that the expression for the language generated by $\Gamma''$ is
a*ba*ba*ba*. This is the language consisting of all words containing exactly
three bs.


In Example 4.10 we generated the language
$\{a^n b^n : n \text{ is a positive integer}\}$. Intuitively, we can see that
this is not a regular language since the only way that we can generate an
infinite regular language using a finite alphabet is with the Kleene star *. In
this case the only possibility is a*b* but, as mentioned earlier, this does not
work since this would also include $a^m b^n$ where $m$ and $n$ are not equal.

One might ask if there is a particular type of grammar which generates only 
regular languages. The answer is yes, as we shall now show. 


Definition 4.5 A context-free grammar $\Gamma = (N, \Sigma, S, P)$ is called a
regular grammar if every production $p \in P$ has the form $n \to w$ where $w$
is the empty word $\lambda$ or the string $w$ contains at most one nonterminal
symbol and it occurs at the end of the string if at all.


Therefore $w$ could be of the form $aacA$, $ab$, $\lambda$ or $bA$, where $a$,
$b$, and $c$ are terminals and $A$ is a nonterminal. However, $w$ could not be of
the form $aAb$, $aAB$, or $Aa$. The production $n \to abcA$ could be replaced by
the productions

$n \to aB \quad B \to bC \quad C \to cA.$

Also it is possible that $w$ could contain no terminal and one nonterminal, so we
have $B \to C$, but if this is followed by $C \to tD$, where $t$ is a terminal,
then we can combine the two productions to get $B \to tD$. Hence it is no
restriction to require each production to be one of the following forms:

$A \to aB \quad B \to b \quad C \to \lambda$

where $A$, $B$, and $C$ are nonterminal elements, and $a$ and $b$ are terminal
elements.
More formally, we define a linear regular grammar as follows: 


Definition 4.6 A context-free grammar $\Gamma = (N, \Sigma, S, P)$ is called a
linear regular grammar if every production $p \in P$ has the form $n \to w$
where the string $w$ has the form $xY$, $x$ or $\lambda$ where $x \in \Sigma$
and $Y \in N$.
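
The replacement of a long production such as $n \to abcA$ by a chain of one-terminal productions can be sketched as follows. This is an illustration only, not the author's procedure: productions are assumed to be (head, body) pairs with the body a list of symbols, terminals are lower-case strings, nonterminals are upper-case strings, and the fresh names `N1`, `N2`, ... are hypothetical and assumed not to occur in the grammar already. Productions of the form $B \to C$ are left untouched here.

```python
from itertools import count

def to_linear(productions):
    """Split each production A -> a1 a2 ... ak [B] into one-terminal steps."""
    fresh = (f"N{i}" for i in count(1))   # hypothetical fresh nonterminal names
    result = []
    for head, body in productions:
        # An optional single nonterminal at the end of the body.
        tail = body[-1:] if body and body[-1].isupper() else []
        terminals = body[:len(body) - len(tail)]
        if len(terminals) <= 1:
            result.append((head, body))   # already of the form xY, x, Y or λ
            continue
        current = head
        for t in terminals[:-1]:
            nxt = next(fresh)
            result.append((current, [t, nxt]))
            current = nxt
        result.append((current, [terminals[-1]] + tail))
    return result

for head, body in to_linear([("S", ["a", "b", "c", "A"])]):
    print(head, "->", " ".join(body))
```

For the single production S → abcA the sketch produces three productions with the same shape as $n \to aB$, $B \to bC$, $C \to cA$ in the text.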


Theorem 4.1 A language is generated by a linear regular grammar if and 
only if it is generated by a regular grammar. 


Proof Obviously every language that is generated by a linear regular grammar
is generated by a regular grammar. To show that every language generated by a
regular grammar is generated by a linear regular grammar, we divide the proof
into two parts. We first show the language of a regular grammar can be generated
by productions of the forms

$A \to aB \quad B \to b \quad C \to \lambda \quad C \to D$

where $A$, $B$, $C$, and $D$ are nonterminals and $a$, $b$ are terminals. Let
$\Gamma = (N, \Sigma, S, P)$ be a regular grammar and $L$ be the language
generated by $\Gamma$. Let $\Gamma' = (N', \Sigma, S, P')$ be the grammar formed
by replacing every production $A \to a_1 a_2 a_3 \ldots a_{n-1} B$ by the set of
productions $A \to a_1 A_1$, $A_1 \to a_2 A_2$, $\ldots$,
$A_{n-3} \to a_{n-2} A_{n-2}$, $A_{n-2} \to a_{n-1} B$, where
$A_1, A_2, \ldots, A_{n-2}$ are new nonterminal symbols. Let $L'$ be the language
generated by $\Gamma'$. By construction we have
$A \Rightarrow^* a_1 a_2 a_3 \ldots a_{n-1} B$ in $\Gamma'$. So any word of $L$
will be generated by the grammar $\Gamma'$. Conversely, if
$A \Rightarrow^* a_1 a_2 a_3 \ldots a_{n-1} B$ is formed by productions
$A \to a_1 A_1$, $A_1 \to a_2 A_2$, $\ldots$, $A_{n-3} \to a_{n-2} A_{n-2}$,
$A_{n-2} \to a_{n-1} B$, then there must be a production
$A \to a_1 a_2 a_3 \ldots a_{n-1} B$ in $\Gamma$ since the symbols
$A_1, A_2, \ldots, A_{n-2}$ are symbols which appear only in forming
$A \Rightarrow^* a_1 a_2 a_3 \ldots a_{n-1} B$.

Hence we can now assume that a regular grammar can be formed using only
productions of the form

$A \to aB \quad B \to b \quad C \to \lambda \quad C \to D$

where $A$, $B$, $C$, and $D$ are nonterminals and $a$, $b$ are terminals. We want
to show that we can form a regular grammar without productions of the form
$C \to D$ where $C$ and $D$ are both nonterminals. Call this a 1-production. Let
$\Gamma'$ be a regular grammar formed using the productions above and $L'$ be the
language generated by $\Gamma'$. Let $\Gamma''$ be the grammar with all
1-productions deleted and insert the production $A_1 \to bA_n$ if

$A_1 \to A_2, \; A_2 \to A_3, \; \ldots, \; A_{n-2} \to A_{n-1}, \; A_{n-1} \to bA_n$

occurred in $\Gamma'$.
If

$A_1 \to A_2, \; A_2 \to A_3, \; \ldots, \; A_{n-2} \to A_{n-1}, \; A_{n-1} \to b$

occurred in $\Gamma'$, insert the production $A_1 \to b$ in $\Gamma''$.
If

$A_1 \to A_2, \; A_2 \to A_3, \; \ldots, \; A_{n-2} \to A_{n-1}, \; A_{n-1} \to \lambda$

occurred in $\Gamma'$, insert the production $A_1 \to \lambda$ in $\Gamma''$. Let
$L''$ be the language generated by the grammar $\Gamma''$. Certainly
$L' \subseteq L''$.

Assume we have $S \Rightarrow^* w$ where the productions are from $\Gamma''$ or
$\Gamma'$ or both and $w \in \Sigma^*$. If all of the productions are from
$\Gamma'$, then $w \in L'$. If not then there exists $uB \Rightarrow vC$ in the
sequence where $B \to aC$ is not a production of $\Gamma'$ and $v = ua$. Take the
first such production. Therefore there exist productions $B \to A_1$,
$A_1 \to A_2$, $A_2 \to A_3$, $\ldots$, $A_{n-2} \to A_{n-1}$,
$A_{n-1} \to aC$ in $\Gamma'$ and we can replace $uB \Rightarrow vC$ with
$uB \Rightarrow uA_1 \Rightarrow uA_2 \Rightarrow \cdots \Rightarrow uA_{n-1} \Rightarrow uaC = vC$,
where all of the productions are in $\Gamma'$. Since only a finite number of
steps use productions not in $\Gamma'$, we can continue this process until all
of the productions are in $\Gamma'$ and $w \in L'$. Therefore
$L'' \subseteq L'$.














We now proceed to prove the following theorem. 


Theorem 4.2 A language is regular if and only if it is generated by a regular 
grammar. 


Since a language is regular if and only if it is accepted by an automa- 
ton, all we need to know is that a language is generated by a regular gram- 
mar if and only if it is accepted by an automaton. We first show how 
to construct a regular grammar which generates the same language that is 
accepted by a given deterministic automaton and we then show how to con- 
struct an automaton which accepts the language generated by a given regular 
grammar. 

Normally when we consider a word being read by an automaton, we probably
think of the automaton as removing letters from the word as it reads it. Thus if
the word to be read is abbc, and there is an a-arrow from state $s_0$ to $s_1$,
then we read a, move to state $s_1$, and still have bbc left to read. If there is
a b-arrow from state $s_1$ to $s_2$ then we read b, move to state $s_2$, and still
have bc left to read. If there is a b-arrow from state $s_2$ to $s_3$ then we read
b, move to state $s_3$, and still have c left to read. Finally, if there is a
c-arrow from state $s_3$ to $s_4$ then we read c, move to state $s_4$, and have
nothing left to read. If $s_4$ is a terminal state, then we accept the word abbc.

We may also think of an automaton adding letters to words rather than
removing them. Suppose that we consider the string that has been read rather
than the string left to read. In the example above, at state $s_1$, we have read
a. At state $s_2$ we have read ab. At state $s_3$ we have read abb, and at state
$s_4$ we have read abbc. Thus at each state we are adding a letter. Consider the
grammar $\Gamma = (N, \Sigma, s_0, P)$, where $N = \{s_0, s_1, s_2, s_3, s_4\}$,
$\Sigma = \{a, b, c\}$, and $P$ is the set of productions

$s_0 \to a s_1 \quad s_1 \to b s_2 \quad s_2 \to b s_3 \quad s_3 \to c s_4 \quad s_4 \to \lambda$

where we have a production $s_4 \to \lambda$ only if $s_4$ is a terminal state. It
is easily seen that $\Gamma$ generates the word abbc. Thus to change an automaton
to a regular grammar, if there is a k-arrow from $s_i$ to $s_j$, in the
corresponding grammar, form the production $s_i \to k s_j$. If $s_j$ is an
acceptance state, add the production $s_j \to \lambda$. We shall shortly show
that this grammar will generate the same language accepted by the automaton.


Example 4.12 Given the automaton

[Figure: state diagram of the automaton]

we form the productions for the corresponding grammar as follows:

Description of automaton                      Production
There is an a-arrow from $s_0$ to $s_1$       $s_0 \to a s_1$
There is a b-arrow from $s_0$ to $s_1$        $s_0 \to b s_1$
There is an a-arrow from $s_1$ to $s_0$       $s_1 \to a s_0$
There is a b-arrow from $s_1$ to $s_2$        $s_1 \to b s_2$
There is an a-arrow from $s_2$ to $s_1$       $s_2 \to a s_1$
There is a b-arrow from $s_2$ to $s_2$        $s_2 \to b s_2$


The state s is an acceptance state s) —> À. 


Hence the corresponding grammar is $\Gamma = (N, \Sigma, s_0, P)$, where
$N = \{s_0, s_1, s_2\}$, $\Sigma = \{a, b\}$, and $P$ is the above set of
productions.

Example 4.13 Given the automaton

[Figure: state diagram of the automaton]

we form the productions for the corresponding grammar as follows:

Description of automaton                      Production
There is an a-arrow from $s_0$ to $s_1$       $s_0 \to a s_1$
There is a b-arrow from $s_0$ to $s_2$        $s_0 \to b s_2$
There is an a-arrow from $s_1$ to $s_2$       $s_1 \to a s_2$
There is a b-arrow from $s_1$ to $s_3$        $s_1 \to b s_3$
There is an a-arrow from $s_2$ to $s_2$       $s_2 \to a s_2$
There is a b-arrow from $s_2$ to $s_3$        $s_2 \to b s_3$
There is an a-arrow from $s_3$ to $s_2$       $s_3 \to a s_2$
There is a b-arrow from $s_3$ to $s_3$        $s_3 \to b s_3$
The state $s_2$ is an acceptance state        $s_2 \to \lambda$
The state $s_3$ is an acceptance state        $s_3 \to \lambda$.

Hence the corresponding grammar is $\Gamma = (N, \Sigma, s_0, P)$, where
$N = \{s_0, s_1, s_2, s_3\}$, $\Sigma = \{a, b\}$, and $P$ is the above set of
productions.


Given an automaton $M = (\Sigma, Q, s_0, T, F)$, we now give a formal definition
of the grammar $\Gamma_M = (N, \Sigma, S, P)$ associated with the automaton and
then show that the language accepted by $M$ is generated by $\Gamma_M$.


Definition 4.7 $\Gamma_M = (N, \Sigma, S, P)$, the grammar associated with the
automaton $M = (\Sigma, Q, s_0, T, F)$, has $N = Q$ and $S = s_0$. The production
$s_i \to a s_j$ is in $P$ if and only if $F(a, s_i) = s_j$, and $s_j \to \lambda$
is in $P$ if and only if $s_j$ is an acceptance state.
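
The construction of Definition 4.7 can be tried out on the automaton of Example 4.13. The sketch below is illustrative only: the encoding of a production as a triple (head, letter, next state), with the letter `""` marking a production $s \to \lambda$, and the function names are assumptions made for this example. It builds the productions from the transition table and then compares, for all words up to length 5, acceptance by the automaton with generation by the grammar.

```python
from itertools import product

# Transition table of the automaton in Example 4.13: (state, letter) -> state.
DELTA = {
    ("s0", "a"): "s1", ("s0", "b"): "s2",
    ("s1", "a"): "s2", ("s1", "b"): "s3",
    ("s2", "a"): "s2", ("s2", "b"): "s3",
    ("s3", "a"): "s2", ("s3", "b"): "s3",
}
ACCEPTING = {"s2", "s3"}

def grammar_from_automaton(delta, accepting):
    """Productions s_i -> a s_j as (s_i, a, s_j); s_j -> λ as (s_j, "", None)."""
    prods = [(s, a, t) for (s, a), t in delta.items()]
    prods += [(s, "", None) for s in sorted(accepting)]
    return prods

def accepts(delta, accepting, start, word):
    """Run the deterministic automaton on word."""
    state = start
    for letter in word:
        state = delta[(state, letter)]
    return state in accepting

def generates(prods, start, word):
    """Check that start =>* word, tracking the nonterminal left after each letter."""
    current = {start}
    for letter in word:
        current = {t for (s, a, t) in prods if s in current and a == letter}
    return any(s in current and a == "" for (s, a, _) in prods)

prods = grammar_from_automaton(DELTA, ACCEPTING)
agree = all(
    accepts(DELTA, ACCEPTING, "s0", "".join(w)) == generates(prods, "s0", "".join(w))
    for n in range(6) for w in product("ab", repeat=n)
)
print(len(prods), agree)
```

The agreement on short words is of course only a sanity check; the general statement is the lemma that follows.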


Lemma 4.1 The language $L_1$ accepted by $M$ is equal to the language $L_2$
generated by $\Gamma_M$.


Proof From the above definition, we have $s_i \to a s_j$ if and only if
$(s_i, a) \vdash (s_j, \lambda)$. Thus if
$(s_i, ab) \vdash (s_j, b) \vdash (s_k, \lambda)$ in $M$ then
$s_i \Rightarrow a s_j \Rightarrow ab s_k$ in $\Gamma_M$. More generally,
$(s_i, w) \vdash^* (s_k, \lambda)$ if and only if $s_i \Rightarrow^* w s_k$.

We first show $L_1 \subseteq L_2$. Assume $w$ is accepted by $M$; then
$(s_0, w) \vdash^* (s_k, \lambda)$ where $s_k$ is an acceptance state. Since
$(s_0, w) \vdash^* (s_k, \lambda)$, we have $s_0 \Rightarrow^* w s_k$ in
$\Gamma_M$. Since $s_k$ is an acceptance state, $s_k \to \lambda$ is a production.
Therefore $s_0 \Rightarrow^* w$ and $w$ is generated by $\Gamma_M$.

Conversely, let $w$ be generated by $\Gamma_M$. Let $s_0 \Rightarrow^* w$; then
$s_0 \Rightarrow^* w s_k \Rightarrow w$. Therefore $s_k \to \lambda$ is a
production, and by definition of $\Gamma_M$, $s_k$ is an acceptance state. Since
$s_0 \Rightarrow^* w s_k$, we have $(s_0, w) \vdash^* (s_k, \lambda)$ in $M$.
Therefore $w$ is accepted by $M$.














Given a regular grammar $\Gamma$ in linear regular grammar form, we now construct
an automaton which accepts the language generated by this grammar. Given
$\Gamma = (N, \Sigma, S, P)$, intuitively we add an additional nonterminal $t$ to
$N$ and for each production $B \to a$, where $a$ is a terminal, we remove this
production and replace it with the productions $B \to at$ and $t \to \lambda$.
Obviously this does not change the language of the grammar. Let
$M = (\Sigma, Q, s_0, T, F)$ be the automaton in which $Q$ is the set of
nonterminals together with the additional nonterminal $t$, and $s_0 = S$. The
function $F$ is defined by $F(a, A) = B$ if and only if $A \to aB$ is in $P$. The
state $B \in T$ if $B \to \lambda$.


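Example 4.14 Let $\Gamma = (N, \Sigma, S, P)$ be the grammar defined by
$N = \{S, A, B, C\}$, $\Sigma = \{a, b, c\}$, and $P$ be the set of productions

$S \to aA \quad A \to aA \quad S \to bB \quad B \to bB$
$A \to cC \quad C \to cC \quad B \to aA \quad C \to \lambda.$

The corresponding automaton is

[Figure: state diagram of the automaton]


Example 4.15 Let $\Gamma = (N, T, S, P)$ be the grammar defined by
$N = \{S, A, B, C\}$, $T = \{a, b, c\}$, and $P$ be the set of productions

$S \to aA \quad A \to bB \quad S \to bB \quad B \to cC \quad A \to aC$
$C \to cA \quad B \to aA \quad C \to \lambda \quad B \to \lambda.$

The corresponding automaton is

[Figure: state diagram of the automaton]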
More formally, given a regular grammar $\Gamma = (N, T, S, P)$ in linear regular
grammar form, we define a nondeterministic automaton $M_\Gamma$ which accepts
the language generated by this grammar. Given $\Gamma = (N, T, S, P)$, let
$M_\Gamma = (\Sigma, Q, s_0, T, \Upsilon)$ be the nondeterministic automaton in
which the alphabet $\Sigma$ is the set of terminals of $\Gamma$, $Q$ is the set
of nonterminals together with an additional state $t$, and $s_0 = S$. The
relation $\Upsilon$ is defined by $B \in \Upsilon(a, A)$ if $A \to aB$ is in $P$
and $t \in \Upsilon(a, A)$ if $A \to a$ is in $P$. The state $B$ is an acceptance
state if $B \to \lambda$ is in $P$ or $B = t$. Hence, for a nonterminal $B$,
$(A, a) \vdash (B, \lambda)$ if and only if $A \to aB$. Thus if
$A \Rightarrow aB \Rightarrow abC$ in $\Gamma$ then
$(A, ab) \vdash (B, b) \vdash (C, \lambda)$ in $M_\Gamma$. More generally,
$A \Rightarrow^* wB$ if and only if $(A, w) \vdash^* (B, \lambda)$ for
nonterminals $A$ and $B$.
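
As a rough illustration of this construction, the sketch below builds the nondeterministic automaton for the grammar of Example 4.14 and simulates it on a few words. The encoding is an assumption made for the example: a production is a triple (head, terminal, nonterminal or `None`), the terminal `""` encodes a production head → λ, and the added state is named `"t"`.

```python
# Productions of Example 4.14; ("C", "", None) encodes C -> λ.
PRODUCTIONS = [
    ("S", "a", "A"), ("A", "a", "A"), ("S", "b", "B"), ("B", "b", "B"),
    ("A", "c", "C"), ("C", "c", "C"), ("B", "a", "A"), ("C", "", None),
]

def automaton_from_grammar(productions, extra="t"):
    """Return (transitions, accepting) of the nondeterministic automaton."""
    transitions = {}        # (state, letter) -> set of possible next states
    accepting = {extra}     # the added state t is always an acceptance state
    for head, letter, tail in productions:
        if letter == "":
            accepting.add(head)                            # head -> λ
        else:
            target = tail if tail is not None else extra   # head -> a leads to t
            transitions.setdefault((head, letter), set()).add(target)
    return transitions, accepting

def accepts(transitions, accepting, start, word):
    """Simulate the automaton by tracking the set of reachable states."""
    current = {start}
    for letter in word:
        current = set().union(*(transitions.get((s, letter), set()) for s in current))
    return bool(current & accepting)

transitions, accepting = automaton_from_grammar(PRODUCTIONS)
for word in ["ac", "bbaacc", "ab", "a"]:
    print(word, accepts(transitions, accepting, "S", word))
```

Tracking a set of states rather than a single state is what allows several productions with the same head and terminal; Example 4.14 happens not to need it.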


Theorem 4.3 The language $L_1$ accepted by $M_\Gamma$ is equal to the language
$L_2$ generated by $\Gamma$.


Proof We first show $L_2 \subseteq L_1$. Let $w = va$ be generated by $\Gamma$.
If $A \Rightarrow^* vaB \Rightarrow va$, then $(A, va) \vdash^* (B, \lambda)$
and since the last production is $B \to \lambda$, $B$ is an acceptance state. If
$A \Rightarrow^* vB \Rightarrow va$, then $(A, v) \vdash^* (B, \lambda)$ so
$(A, va) \vdash^* (B, a)$; since the last production is $B \to a$,
$(B, a) \vdash (t, \lambda)$, and $t$ is an acceptance state. Therefore $w$ is
accepted by $M_\Gamma$.

To show $L_1 \subseteq L_2$ let $w = va$ be accepted by $M_\Gamma$. If
$(A, va) \vdash^* (B, \lambda)$ and $B$ is an acceptance state with
$B \to \lambda$, then $A \Rightarrow^* vaB \Rightarrow va$. If
$(A, va) \vdash^* (B, a) \vdash (t, \lambda)$, then
$(A, v) \vdash^* (B, \lambda)$ so $A \Rightarrow^* vB$, and since
$(B, a) \vdash (t, \lambda)$, $B \to a$ is a production, so
$A \Rightarrow^* vB \Rightarrow va$. Either way, $w$ is generated by $\Gamma$.














Exercises 


(1) Using the grammar in Example 4.9, construct a parse tree for abbb. 

(2) Using the grammar in Example 4.10, construct a parse tree for aaabbb. 

(3) Using the grammar in Example 4.11, construct a parse tree for babaab. 

(4) In Example 4.4, derive the statement “The cowboy rode slowly into the 
sunset” and construct the correspondence parse tree. 

(5) Find the language generated by the grammar $\Gamma = (N, T, S, P)$ defined
by $N = \{S, A, B\}$, $T = \{a, b\}$ and the set of productions $P$ given by

$S \to AB \quad A \to aA \quad A \to \lambda \quad B \to Bb \quad B \to \lambda.$

(6) Find the language generated by the grammar $\Gamma = (N, T, S, P)$ defined
by $N = \{S, A, B\}$, $T = \{a, b\}$ and the set of productions $P$ given by

$S \to aB \quad B \to bA \quad A \to aB \quad B \to b.$

(7) Find the language generated by the grammar $\Gamma = (N, T, S, P)$ defined
by $N = \{S, A, B\}$, $T = \{a, b\}$ and the set of productions $P$ given by

$S \to aA \quad B \to aA \quad S \to bB \quad A \to aB$
$B \to bB \quad A \to bA \quad B \to b \quad A \to a.$

(8) Find the language generated by the grammar $\Gamma = (N, T, S, P)$ defined
by $N = \{S, A, B, C\}$, $T = \{a, b\}$ and the set of productions $P$ given by

$S \to C \quad A \to aB \quad C \to bC \quad B \to bB$
$C \to aA \quad B \to aA \quad A \to bA \quad B \to \lambda.$

(9) Find the grammar which generates the language $ww^r$ where $w$ is a string
of as and bs and $w^r$ is the reverse string. For example, abba, abaaba, and
abbbba belong to $ww^r$.

(10) Construct a grammar which generates the language $wcw^r$ where
$w \in \{a, b\}^*$ and $w^r$ is the reverse string.

(11) Construct a grammar which generates the language
$L = \{w : w \in \{a, b\}^* \text{ and } w = w^r\}$.

(12) Construct a grammar which generates the language L described by the
expression aa*bb*.

(13) Construct a grammar which generates the language L described by the
expression (abc)*.

(14) Construct a grammar which generates the language L described by the
expression (ab)* ∨ (ac)*.

(15) Construct a grammar which generates the language L described by the
expression ac(be)*d.

(16) Construct a grammar which generates the language expressed by
(a*ba*ba*b)*.

(17) Construct a grammar which generates the language expressed by
(a*(ba)*bb*a)*.

(18) Construct a grammar which generates the language expressed by
(a*b) ∨ (b*a)*.

(19) Construct a grammar which generates the language expressed by
aa*bb*aa*.

(20) Construct a grammar which generates the language expressed by
(a*b) ∨ (c*b) ∨ (ac)*.

(21) Construct a grammar which generates the language expressed by
(a ∨ b)*(aa ∨ bb)(a ∨ b)*.

(22) Construct a grammar which generates the language expressed by
((aa*b) ∨ bb*a)ac*.

(23) Construct a grammar to generate arithmetic expressions for positive
integers less than ten in prefix notation.

(24) Find an automaton which accepts the language generated by the grammar
$\Gamma = (N, T, S, P)$ defined by $N = \{S, A, B\}$, $T = \{a, b\}$ and the set of
productions $P$ given by

$S \to aB \quad B \to bA \quad A \to aB \quad B \to b.$

(25) Find an automaton which accepts the language generated by the grammar
$\Gamma = (N, T, S, P)$ defined by $N = \{S, A, B\}$, $T = \{a, b\}$ and the set of
productions $P$ given by

$S \to aA \quad B \to aA \quad S \to bB \quad A \to aB$
$B \to bB \quad A \to bA \quad B \to b \quad A \to a.$

(26) Find an automaton which accepts the language generated by the grammar
$\Gamma = (N, T, S, P)$ defined by $N = \{S, A, B, C\}$, $T = \{a, b\}$ and the set
of productions $P$ given by

$S \to C \quad A \to aB \quad C \to bC \quad B \to bB$
$C \to aA \quad B \to aA \quad A \to bA \quad B \to \lambda.$

(27) Find an automaton which accepts the language generated by the grammar 
Tr = (N,T, S, P) defined by N = {S, A, B, C}, T = {a, b} and the set of 
productions P given by 

S>C C—>b C- aA A—>aA 
C —> aC A>a C>a A>. 
(28) Find an automaton which accepts the language generated by the grammar
$\Gamma = (N, T, S, P)$ defined by $N = \{S, A, B, C\}$, $T = \{a, b\}$ and the set
of productions $P$ given by

$S \to C \quad C \to aaC \quad C \to abC$
$C \to baC \quad C \to bbC \quad C \to \lambda.$

(29) Find an automaton which accepts the language generated by the grammar
$\Gamma = (N, T, S, P)$ defined by $N = \{S, A, B, C\}$, $T = \{a, b\}$ and the set
of productions $P$ given by

$S \to C \quad B \to aB \quad C \to bC \quad B \to bB \quad C \to aA$
$B \to a \quad A \to bC \quad B \to b \quad A \to aB.$
(30) Construct a grammar which generates the language accepted by the
automaton

[Figure: state diagram of the automaton]

(31) Construct a grammar which generates the language accepted by the
automaton

[Figure: state diagram of the automaton]

(32) Construct a grammar which generates the language accepted by the
automaton

[Figure: state diagram of the automaton]

(33) Construct a grammar which generates the language accepted by the
automaton

[Figure: state diagram of the automaton]

(34) Construct a grammar which generates the language accepted by the
automaton

[Figure: state diagram of the automaton]

(35) Construct a grammar which generates the language accepted by the
automaton

[Figure: state diagram of the automaton]

(36) Construct a grammar which generates the language accepted by the
automaton

[Figure: state diagram of the automaton]

4.2 Chomsky normal form and Greibach normal form 


Definition 4.8 A context-free grammar $\Gamma$ is in Chomsky normal form if each
of its productions is either of the form

$A \to BC$

or

$A \to a$

where $A$, $B$, and $C$ are nonterminals and $a$ is a terminal.


Definition 4.9 A context-free grammar $\Gamma$ is in Greibach normal form if each
of its productions is of the form

$A \to aW$

where $a$ is a terminal and $W$ is a possibly empty string of nonterminals.
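
The two definitions translate directly into membership tests on the productions. The following sketch assumes a hypothetical encoding in which a production is a pair (head, body) and the body is a tuple of symbols; the function names are illustrative and not taken from the text.

```python
def is_chomsky(productions, nonterminals, terminals):
    """True if every production has the form A -> BC or A -> a."""
    return all(
        (len(body) == 2 and all(s in nonterminals for s in body))
        or (len(body) == 1 and body[0] in terminals)
        for _, body in productions
    )

def is_greibach(productions, nonterminals, terminals):
    """True if every body is one terminal followed by zero or more nonterminals."""
    return all(
        len(body) >= 1
        and body[0] in terminals
        and all(s in nonterminals for s in body[1:])
        for _, body in productions
    )

N, T = {"S", "A", "B"}, {"a", "b"}
cnf = [("S", ("A", "B")), ("A", ("a",)), ("B", ("b",))]
gnf = [("S", ("a", "B")), ("B", ("b",))]
print(is_chomsky(cnf, N, T), is_greibach(cnf, N, T))
print(is_chomsky(gnf, N, T), is_greibach(gnf, N, T))
```

Both sample grammars are small enough to check by hand: the first contains $S \to AB$, which does not begin with a terminal, and the second contains $S \to aB$, which mixes a terminal with a nonterminal.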
We shall show that every language L, not containing the empty word, which
is generated by a context-free grammar can be generated by a context-free
grammar in Chomsky normal form. We shall also show that every language L,
not containing the empty word, which is generated by a context-free grammar
can be generated by a context-free grammar in Greibach normal form.

We shall first show that a language L, not containing the empty word,
generated by a context-free grammar, can be generated by a context-free grammar
in Chomsky normal form. We begin with a series of lemmas. The first lemma,
which demonstrates the flexibility of derivations in context-free languages,
shows us that if we have a derivation $UV \Rightarrow^* W$ where
$U, V, W \in (N \cup T)^*$ then $U$ and $V$ may be treated separately.


Lemma 4.2 Let $\Gamma = (N, T, S, P)$ be a grammar and $UV \Rightarrow^* W$,
where $U, V, W \in (N \cup T)^*$, be a derivation in $\Gamma$ with $n$ steps; then
$W$ can be expressed as $W_1 W_2$ where $U \Rightarrow^* W_1$,
$V \Rightarrow^* W_2$ are derivations in $\Gamma$, both containing at most $n$
steps.


Proof The proof of this lemma uses induction on the number of steps in the
derivation. Assume there is one step. Then only one nonterminal is replaced
using a production. Assume it is the production $A \to wBw'$. Either $A$ is in
the string $U$ or is in the string $V$. Without loss of generality assume it is
in $U$, so

$U = XAY$

and

$U \Rightarrow XwBw'Y = U'$.

Further

$UV \Rightarrow XwBw'YV = U'V$,

so letting $W_1 = U'$ and $W_2 = V$ we are done.

Assume the lemma is true for all derivations with less than $k$ steps. Assume
$UV \Rightarrow^* W$ contains $k$ steps. As above assume the first step is
$UV \Rightarrow U'V$ where $U \Rightarrow U'$. Note that $U'V \Rightarrow^* W$
uses only $k - 1$ steps. By induction there are derivations
$U' \Rightarrow^* W_1$, $V \Rightarrow^* W_2$ containing at most $k - 1$ steps.
Therefore $U \Rightarrow U'$, $U' \Rightarrow^* W_1$, $V \Rightarrow^* W_2$ are
the required derivations.














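One of the results of this lemma is that we can get from $UV$ to $W$ by the
derivations

$UV \Rightarrow^* W_1 V \Rightarrow^* W_1 W_2$

where $W = W_1 W_2$, since if $X_\alpha \Rightarrow X_\beta$ is a derivation, then
so are $X_\alpha V \Rightarrow X_\beta V$ and $U X_\alpha \Rightarrow U X_\beta$.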
The next lemma shows us that in applying productions, the order in which
we apply them is not particularly important. In fact we can derive any word
$w$ in the language generated by $\Gamma$ by replacing the leftmost nonterminal
by a production at each step in the derivation. This is called a leftmost
derivation of $w$.


Lemma 4.3 Let $w \in \Gamma(L)$, the language generated by
$\Gamma = (N, T, S, P)$. There exists a leftmost derivation of $w$.


Proof The proof uses induction on the number of steps $n$ in the derivation. If
$n = 1$, the derivation $S \Rightarrow w$ is obviously a leftmost derivation. If
$n = k > 1$ and $S \Rightarrow^* w$, let $S \Rightarrow UV$ where
$U, V \in (N \cup T)^*$. By Lemma 4.2, there exist derivations
$U \Rightarrow^* w_1$ and $V \Rightarrow^* w_2$, where $w = w_1 w_2$. Since both
of these derivations contain less than $k$ steps, there exist leftmost
derivations $U \Rightarrow^* w_1$ and $V \Rightarrow^* w_2$. Then
$S \Rightarrow UV \Rightarrow^* w_1 V \Rightarrow^* w_1 w_2$ is a leftmost
derivation of $w$.




















The following lemma shows that if the language L generated by a context-free
grammar $\Gamma$ does not contain the empty word $\lambda$ then L can be
generated by a context-free grammar which does not contain any productions of
the form $A \to \lambda$, which we shall call a $\lambda$ production. The only
purpose of such a production is to remove $A$ from the string of symbols. For
example if we have productions $C \to abBaAaaa$ and $A \to \lambda$, we can then
derive $C \Rightarrow^* abBaaaa$. We could simply remove $A \to \lambda$ and add
the production $C \to abBaaaa$. We would have to do this wherever $A$ occurs in
a production. For example if we have the production $C \to abAaAba$, then if
$A \to \lambda$ is removed, we would have to include $C \to abAaba$,
$C \to abaAba$, and $C \to ababa$.

Suppose we have productions $A \to aB$, $B \to C$, $C \to aa$, $C \to \lambda$.
If we add the production $B \to \lambda$, we have created a new $\lambda$
production. If we just remove $C \to \lambda$, we can no longer derive $a$. A
nonterminal $X$ is called nilpotent if $X \Rightarrow^* \lambda$. We solve the
problem above by removing all nilpotents and not just those directly from
$\lambda$ productions. Thus we would also add the production $A \to a$ above
since $B$ is nilpotent.

In the next lemma it is necessary to be able to determine the nilpotents of a
grammar. The following algorithm determines the set $\Theta$ of all nilpotents in
a grammar by examining the productions.


(1) If $A \to \lambda$ is a production, then $A \in \Theta$.
(2) If $A \to A_1 A_2 \ldots A_n$ where $A_1, A_2, \ldots, A_n \in \Theta$, then
$A \in \Theta$.
(3) Continue (2) until no new elements are added to $\Theta$.
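
The three steps above can be sketched as a simple fixed-point loop. The encoding is an assumption for illustration: a production is a pair (head, body) with the body a string of one-letter symbols, the empty string stands for $\lambda$, and the function name `nilpotents` is hypothetical.

```python
def nilpotents(productions):
    """Return the set of nonterminals X with X =>* λ."""
    found = {head for head, body in productions if body == ""}       # step (1)
    changed = True
    while changed:                                                    # step (3)
        changed = False
        for head, body in productions:
            # step (2): every symbol of the body is already known to be nilpotent
            if head not in found and body and all(s in found for s in body):
                found.add(head)
                changed = True
    return found

# The productions discussed before the algorithm: A -> aB, B -> C, C -> aa, C -> λ.
print(sorted(nilpotents([("A", "aB"), ("B", "C"), ("C", "aa"), ("C", "")])))
```

Here $C$ is found in step (1) and $B$ in step (2), while $A$ is not nilpotent because its only production contains the terminal $a$.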
Lemma 4.4 Let $\Gamma$ be a grammar such that $\Gamma(L)$ does not contain the
empty word. Form a grammar $\Gamma'$ by beginning with the productions in
$\Gamma$ and

(i) removing all $\lambda$ productions;

(ii) for each production $A \to w$ in $\Gamma$, where
$w = w_1 X_1 w_2 X_2 \ldots w_n X_n$ with (not necessarily distinct) nilpotents
$X_1, X_2, \ldots, X_n$ in $w$, letting $\mathcal{P}$ be the power set of
$\{1, 2, \ldots, n\}$ and, for each $p \in \mathcal{P}$, forming the production
$A \to w_p$, where $w_p$ is the string $w$ with the $\{X_i : i \in p\}$ removed,
except for those which produce $\lambda$ productions.


The language generated by $\Gamma'$ is equal to the language generated by
$\Gamma$.


Proof Let $L$ be the language generated by $\Gamma = (N, T, S, P)$, and $L'$ be
the language generated by $\Gamma' = (N, T, S, P')$. The language
$L' \subseteq L$, since any production in $\Gamma'$ which is not in $\Gamma$ can
be replaced with the original productions in $\Gamma$ used to define it.

To show $L \subseteq L'$, let $w \in L$. Using induction on the number of steps
$n$ in the derivation, we show that if $S \Rightarrow^* w$ using productions in
$P$, then $S \Rightarrow^* w$ using productions in $P'$. If $n = 1$ then
$S \to w$ is obviously a production in $P'$ since $w \neq \lambda$. Assume
$n = k$ and $S \Rightarrow^* w$ is a derivation in $\Gamma$ containing $k$ steps.
Let $S \Rightarrow A_1 A_2 A_3 \ldots A_m$ be the first step of the derivation,
where $A_i \in N \cup T$. Therefore

$S \Rightarrow A_1 A_2 A_3 \ldots A_m \Rightarrow^* w$ is the derivation $S \Rightarrow^* w$.

By Lemma 4.2, there exist derivations $A_i \Rightarrow^* w_i$ in $\Gamma$, for
$1 \le i \le m$, where $w = w_1 w_2 \ldots w_m$ and each derivation has less than
$k$ steps. By induction, if $w_i \neq \lambda$ there exist derivations
$A_i \Rightarrow^* w_i$ in $\Gamma'$ for $1 \le i \le m$ (note that if $A_i$ is a
terminal then $A_i = w_i$ and $A_i \Rightarrow^* w_i$ has 0 steps). Let
$A_i' = A_i$ if $w_i \neq \lambda$ and $A_i' = \lambda$ if $w_i = \lambda$. Then
$S \Rightarrow A_1' A_2' A_3' \ldots A_m'$ is a derivation in $\Gamma'$ and

$S \Rightarrow A_1' A_2' A_3' \ldots A_m' \Rightarrow^* w_1 A_2' A_3' \ldots A_m' \Rightarrow^* w_1 w_2 A_3' \ldots A_m' \Rightarrow^* \cdots \Rightarrow^* w_1 w_2 \ldots w_m$

is a derivation in $\Gamma'$.
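
The construction of $\Gamma'$ in Lemma 4.4 can be sketched as follows. This is an illustration under assumptions, not a definitive implementation: productions are pairs (head, body) with the body a string of one-letter symbols, `""` stands for $\lambda$, and the set of nilpotents is assumed to have been computed beforehand and passed in.

```python
from itertools import combinations

def remove_lambda(productions, nilpotent):
    """Return the productions of the grammar without λ productions."""
    result = set()
    for head, body in productions:
        positions = [i for i, s in enumerate(body) if s in nilpotent]
        # Each subset p of the nilpotent positions yields one variant w_p.
        for r in range(len(positions) + 1):
            for p in combinations(positions, r):
                variant = "".join(s for i, s in enumerate(body) if i not in p)
                if variant:                       # never create a λ production
                    result.add((head, variant))
    return result

for head, body in sorted(remove_lambda({("C", "abAaAba"), ("A", "")}, {"A"})):
    print(head, "->", body)
```

For $C \to abAaAba$ with $A$ nilpotent, the sketch keeps the original production and adds $C \to abAaba$, $C \to abaAba$ and $C \to ababa$, the three productions named earlier in the text.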





For future reference, we point out that if one follows the proof above
carefully, one finds that if $\Gamma(L)$ had contained the empty word, the only
difference between $\Gamma(L)$ and $\Gamma'(L')$ is that $\Gamma(L)$ would have
contained the empty word, while $\Gamma'(L')$ does not. The nonterminal $S$ is a
nilpotent that does not get removed; however, if $S \to \lambda$ is a production,
the production is removed.

We now start making progress toward Chomsky normal form. We show
that given a grammar we can determine another grammar which generates the
same language and has no productions of the form $A \to B$, where $A, B \in N$.
These productions are called trivial productions since they simply relabel
nonterminals. The process is simple. If $A \to B$ and $B \to W$, where
$W \in (T \cup N)^*$, we remove $A \to B$ and include $A \to W$. More generally,
if $A_1 \to A_2 \to A_3 \to \cdots \to A_m \Rightarrow^* B$, where each
$A_i \to A_{i+1}$ is a trivial production and $B \to w$, then remove the trivial
productions and include $A_1 \to w$.
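
One way to carry out this removal mechanically is to compute, for each nonterminal $A$, every nonterminal reachable from $A$ through trivial productions alone, and then give $A$ all the non-trivial productions of those nonterminals. The sketch below follows that idea; the encoding (pairs (head, body), body a string of one-letter symbols) and the function name are assumptions for illustration, and $\lambda$ productions are assumed to have been removed already.

```python
def remove_trivial(productions, nonterminals):
    """Return an equivalent set of productions with no production A -> B."""
    def is_trivial(body):
        return body in nonterminals          # the body is a single nonterminal

    # reach[A]: nonterminals obtainable from A using trivial productions only.
    reach = {A: {A} for A in nonterminals}
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if is_trivial(body):
                for A in nonterminals:
                    if head in reach[A] and body not in reach[A]:
                        reach[A].add(body)
                        changed = True
    return {
        (A, body)
        for A in nonterminals
        for head, body in productions
        if head in reach[A] and not is_trivial(body)
    }

grammar = {("A", "aB"), ("A", "a"), ("B", "C"), ("C", "aa")}
for head, body in sorted(remove_trivial(grammar, {"A", "B", "C"})):
    print(head, "->", body)
```

In the sample grammar the trivial production $B \to C$ disappears and $B \to aa$ takes its place.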


Lemma 4.5 If $\Gamma(L)$, the language generated by $\Gamma = (N, T, S, P)$, does
not contain the empty word, then there exists a grammar $\Gamma'$ with no
$\lambda$ productions and no trivial productions such that
$\Gamma(L) = \Gamma'(L')$.


Proof First assume $\Gamma$ has had the $\lambda$ productions removed as shown in
the previous lemma. Create $\Gamma'$ by removing all of the trivial productions
and, wherever $A_1 \to A_2 \to A_3 \to \cdots \to A_m \Rightarrow^* B$ occurs,
where each $A_i \to A_{i+1}$ is a trivial production and $B \to w$, including
$A_1 \to w$.

By construction, $\Gamma'(L') \subseteq \Gamma(L)$.

Conversely, assume $S \Rightarrow^* w$ occurs in $\Gamma$. We use induction on the
number of trivial productions used to show that there is a derivation
$S \Rightarrow^* w$ in $\Gamma'$. Obviously if there is no trivial production then
the derivation is in $\Gamma'$. Assume there are $k$ trivial productions in the
derivation. Assume that the derivation is a leftmost derivation of $w$. Assume
$S \Rightarrow^* w$ has the form

$S \Rightarrow^* V_1 A_1 V_2 \Rightarrow^* w_1 A_1 V_2 \Rightarrow w_1 A_2 V_2 \Rightarrow w_1 A_3 V_2 \Rightarrow \cdots \Rightarrow w_1 A_m V_2$
$\Rightarrow w_1 w' V_2$
$\Rightarrow^* w_1 w' w_2$

where $A_1 \to A_2 \to A_3 \to \cdots \to A_m$ is the last sequence of trivial
productions in the derivation, and $A_m \to w'$. Then there are derivations
$V_1 \Rightarrow^* w_1$, $V_2 \Rightarrow^* w_2$ in $\Gamma'$, and

$S \Rightarrow^* V_1 A_1 V_2 \Rightarrow^* w_1 A_1 V_2 \Rightarrow w_1 w' V_2 \Rightarrow^* w_1 w' w_2$

has fewer trivial productions and all productions are in $\Gamma \cup \Gamma'$.
Hence by the induction hypothesis there is a derivation $S \Rightarrow^* w$ in
$\Gamma'$.














Lemma 4.6 If $\Gamma(L)$, the language generated by $\Gamma = (N, T, S, P)$, does
not contain the empty word, then there exists a grammar
$\Gamma' = (N, T, S, P')$ in which every production either has the form
$A \to A_1 A_2 A_3 \ldots A_m$ for $m \ge 2$ where $A, A_1, A_2, A_3, \ldots, A_m$
are nonterminals or $A \to a$ where $A$ is a nonterminal and $a$ is a terminal,
such that $\Gamma(L) = \Gamma'(L')$.


Proof Assume all $\lambda$ productions and all trivial productions have been eliminated. Thus all productions are of the form $A \to A_1 A_2 A_3 \cdots A_m$ where $m \ge 2$ and $A_i \in N \cup T$, or $A \to a$ where $A$ is a nonterminal and $a$ is a terminal. If $A_1, A_2, A_3, \ldots, A_m$ all are nonterminals then the production has the proper form. If not, for each $A_i = a_i$ where $a_i$ is a terminal, form a new nonterminal $X_{a_i}$. Replace $A \to A_1 A_2 A_3 \cdots A_m$ with $A \to A'_1 A'_2 A'_3 \cdots A'_m$ where $A'_i = A_i$ if $A_i$ is a nonterminal and $A'_i = X_{a_i}$ if $A_i$ is a terminal. Thus if we have $A \to V_1 a_1 V_2 a_2 V_3 a_3 \cdots V_n a_n V_{n+1}$ where $V_i \in N^{*}$ and $a_i$ is a terminal, replace it with $A \to V_1 X_{a_1} V_2 X_{a_2} V_3 X_{a_3} \cdots V_n X_{a_n} V_{n+1}$ and add productions $X_{a_i} \to a_i$ for $1 \le i \le n$. Let $\Gamma' = (N, T, S, P')$ be the new grammar formed. We need to show that $\Gamma'(L') = \Gamma(L)$. Clearly $\Gamma(L) \subseteq \Gamma'(L')$ since

\[
A \Rightarrow V_1 a_1 V_2 a_2 V_3 a_3 \cdots V_n a_n V_{n+1}
\]

in $\Gamma$ can be replaced by

\[
\begin{aligned}
A &\Rightarrow V_1 X_{a_1} V_2 X_{a_2} V_3 X_{a_3} \cdots V_n X_{a_n} V_{n+1} \\
  &\Rightarrow V_1 a_1 V_2 X_{a_2} V_3 X_{a_3} \cdots V_n X_{a_n} V_{n+1} \\
  &\Rightarrow V_1 a_1 V_2 a_2 V_3 X_{a_3} \cdots V_n X_{a_n} V_{n+1} \\
  &\;\;\vdots \\
  &\Rightarrow V_1 a_1 V_2 a_2 V_3 a_3 \cdots V_n a_n V_{n+1}
\end{aligned}
\]

in $\Gamma'$.
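The replacement of terminals by new nonterminals can be tried out mechanically. The following Python fragment is only a sketch under our own assumptions: productions are stored as a dictionary from a nonterminal to a list of right-hand sides, each right-hand side is a list of symbols, and the new nonterminal $X_a$ is written as the string `"X_a"`. The name `lift_terminals` is ours and does not come from the text.

```python
def lift_terminals(productions, terminals):
    """Replace terminals inside right-hand sides of length >= 2.

    productions: dict mapping a nonterminal to a list of right-hand sides,
                 each right-hand side being a list of symbols.
    terminals:   set of terminal symbols.
    """
    result = {}
    for lhs, alternatives in productions.items():
        for rhs in alternatives:
            if len(rhs) >= 2:
                # each terminal a_i becomes the new nonterminal X_{a_i}
                rhs = ["X_" + s if s in terminals else s for s in rhs]
            result.setdefault(lhs, []).append(list(rhs))
    # add X_a -> a only for the helper nonterminals that are actually used
    used = {s for alts in result.values() for rhs in alts for s in rhs}
    for a in sorted(terminals):
        if "X_" + a in used:
            result["X_" + a] = [[a]]
    return result


grammar = {"S": [["a", "S", "b"], ["a", "b"]], "A": [["a"]]}
lifted = lift_terminals(grammar, {"a", "b"})
```

Productions of the form $A \to a$ are left alone, exactly as in the proof; only right-hand sides of length at least two are rewritten.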
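Conversely assume that in the derivation

\[
S \Rightarrow^{*} w_1 w w_2
\]

where $U \Rightarrow^{*} w_1$, $A \Rightarrow^{*} w$, and $V_i \Rightarrow^{*} v_i$ are derivations in $\Gamma$,

\[
\begin{aligned}
S &\Rightarrow^{*} U A V \Rightarrow U V_1 X_{a_1} V_2 X_{a_2} \cdots V_n X_{a_n} V_{n+1} V \\
  &\Rightarrow^{*} w_1 v_1 a_1 v_2 a_2 v_3 a_3 \cdots v_n a_n v_{n+1} w_2 = w_1 w w_2,
\end{aligned}
\]

where

\[
A \to V_1 X_{a_1} V_2 X_{a_2} \cdots V_n X_{a_n} V_{n+1}
\]

is a production which is in $\Gamma'$ and not in $\Gamma$, and the derivation is

\[
\begin{aligned}
S &\Rightarrow^{*} U A V \\
  &\Rightarrow^{*} w_1 A V \\
  &\Rightarrow w_1 V_1 X_{a_1} V_2 X_{a_2} \cdots V_n X_{a_n} V_{n+1} V \\
  &\Rightarrow^{*} w_1 v_1 X_{a_1} V_2 X_{a_2} \cdots V_n X_{a_n} V_{n+1} V \\
  &\Rightarrow w_1 v_1 a_1 V_2 X_{a_2} \cdots V_n X_{a_n} V_{n+1} V \\
  &\Rightarrow^{*} w_1 v_1 a_1 v_2 X_{a_2} \cdots V_n X_{a_n} V_{n+1} V \\
  &\Rightarrow w_1 v_1 a_1 v_2 a_2 V_3 X_{a_3} \cdots V_n X_{a_n} V_{n+1} V \\
  &\;\;\vdots \\
  &\Rightarrow w_1 v_1 a_1 v_2 a_2 v_3 a_3 \cdots v_n a_n V_{n+1} V \\
  &\Rightarrow^{*} w_1 v_1 a_1 v_2 a_2 v_3 a_3 \cdots v_n a_n v_{n+1} V \\
  &\Rightarrow^{*} w_1 w w_2.
\end{aligned}
\]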




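This may be replaced by

\[
\begin{aligned}
S &\Rightarrow^{*} U V_1 a_1 V_2 a_2 \cdots V_n a_n V_{n+1} V \\
  &\Rightarrow^{*} w_1 V_1 a_1 V_2 a_2 \cdots V_n a_n V_{n+1} V \\
  &\Rightarrow^{*} w_1 v_1 a_1 V_2 a_2 \cdots V_n a_n V_{n+1} V \\
  &\;\;\vdots \\
  &\Rightarrow^{*} w_1 v_1 a_1 v_2 a_2 \cdots a_n V_{n+1} V \\
  &\Rightarrow^{*} w_1 v_1 a_1 v_2 a_2 \cdots a_n v_{n+1} V \\
  &\Rightarrow^{*} w_1 v_1 a_1 v_2 a_2 \cdots a_n v_{n+1} w_2 \\
  &= w_1 w w_2.
\end{aligned}
\]

We have a derivation for $S \Rightarrow^{*} w_1 w w_2$ in $\Gamma$. Hence $\Gamma'(L') \subseteq \Gamma(L)$.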





From the above lemmas we are now able to prove that a context-free grammar $\Gamma$ whose language does not contain the empty word can be expressed in Chomsky normal form.


Lemma 4.7 If $\Gamma(L)$, the language generated by $\Gamma = (N, T, S, P)$, does not contain the empty word, then there exists a grammar $\Gamma'$ in which every production has either the form

\[
A \to BC
\]

or

\[
A \to a
\]

where $A$, $B$, and $C$ are nonterminals and $a$ is a terminal, such that $\Gamma(L) = \Gamma'(L')$.


Proof By the previous lemma, we may assume that $\Gamma$ is a grammar in which every production has either the form $A \to A_1 A_2 A_3 \cdots A_m$ where $A, A_1, A_2, A_3, \ldots, A_m$ are nonterminals or $A \to a$ where $A$ is a nonterminal and $a$ is a terminal. We construct a new grammar $\Gamma'$ by replacing every production of the form $A \to A_1 A_2 A_3 \cdots A_m$ with $m > 2$ by the set of productions $A \to A_1 X_1,\ X_1 \to A_2 X_2,\ \ldots,\ X_{m-2} \to A_{m-1} A_m$, where each replacement of a production in $\Gamma$ uses a new set of symbols. Since

\[
A \Rightarrow A_1 X_1 \Rightarrow A_1 A_2 X_2 \Rightarrow^{*} A_1 A_2 A_3 \cdots A_m
\]

is a derivation in $\Gamma'$, $\Gamma(L) \subseteq \Gamma'(L')$.
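The splitting of a long production into a chain of productions with two symbols on the right can be sketched in Python. This is an illustration under our own assumptions, not part of the proof: right-hand sides are lists of symbols, the fresh symbols are named `X1`, `X2`, ... by a global counter, and we assume that no symbol with such a name already occurs in the grammar. The function name `binarize` is ours.

```python
def binarize(productions):
    """Replace A -> A1 A2 ... Am (m > 2) by A -> A1 X1, X1 -> A2 X2, ...

    productions: dict mapping a nonterminal to a list of right-hand sides,
                 each right-hand side being a list of symbols.
    """
    result = {}
    counter = 0
    for lhs, alternatives in productions.items():
        for rhs in alternatives:
            head = lhs
            symbols = list(rhs)
            while len(symbols) > 2:
                counter += 1
                fresh = f"X{counter}"  # a new symbol for every replacement
                result.setdefault(head, []).append([symbols[0], fresh])
                head, symbols = fresh, symbols[1:]
            # at most two symbols remain: keep them under the current head
            result.setdefault(head, []).append(symbols)
    return result


cnf = binarize({"A": [["B", "C", "D", "E"]], "B": [["b"]]})
```

For the production $A \to BCDE$ this yields $A \to BX_1$, $X_1 \to CX_2$, and $X_2 \to DE$, while $B \to b$ is unchanged.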

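Conversely, if $S \Rightarrow^{*} w$ in $\Gamma'$ contains no productions which are not in $\Gamma$, then $w \in \Gamma(L)$. If it does, let $W_m$ be the last term in the derivation containing a symbol in $\Gamma'$ which is not in $\Gamma$, so we have $W_m \Rightarrow W_{m+1} \Rightarrow^{*} w$ and $W_m \Rightarrow W_{m+1}$ has the form $U' X_{m-2} V \Rightarrow U' A_{m-1} A_m V$. Therefore the derivation uses the set of productions $A \to A_1 X_1,\ X_1 \to A_2 X_2,\ \ldots,\ X_{m-2} \to A_{m-1} A_m$ and has the form

\[
\begin{aligned}
S &\Rightarrow^{*} U A V \Rightarrow U A_1 X_1 V \Rightarrow^{*} U A'_1 X_1 V \\
  &\Rightarrow U A'_1 A_2 X_2 V \Rightarrow^{*} U A'_1 A'_2 X_2 V \\
  &\Rightarrow U A'_1 A'_2 A_3 X_3 V \Rightarrow^{*} U A'_1 A'_2 A'_3 X_3 V \\
  &\;\;\vdots \\
  &\Rightarrow^{*} U A'_1 \cdots A_{m-2} X_{m-2} V \Rightarrow^{*} U A'_1 \cdots A'_{m-2} X_{m-2} V \\
  &\Rightarrow W_{m+1} \Rightarrow^{*} w
\end{aligned}
\]

where $U' = U A'_1 A'_2 A'_3 \cdots A'_{m-2}$ and $A_i \Rightarrow^{*} A'_i$ is a derivation in $\Gamma$. If this derivation $S \Rightarrow^{*} w$ is not in $\Gamma$, we again pick the last term in the derivation containing a symbol in $\Gamma'$ which is not in $\Gamma$, and continue the process until no such terms are left. Therefore $w \in \Gamma(L)$ and $\Gamma'(L') \subseteq \Gamma(L)$.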














Finally we remove the restriction that $\Gamma(L)$ does not contain the empty word. As mentioned following the proof of Lemma 4.4, by eliminating $\lambda$ productions, if $\Gamma(L)$ contained the empty word, one produced the same language with only the empty word eliminated. Since all of the languages of the forms of grammars developed since Lemma 4.4 are the same, if $\Gamma(L)$ contained the empty word, the language developed by the grammar $\Gamma'$ in the previous lemma would have differed from $\Gamma(L)$ only in the fact that $\Gamma(L)$ contained the empty word while $\Gamma'(L')$ did not. Thus to get $\Gamma(L)$ we need only have productions that add the empty word to the language and leave the rest of the language alone. We do this by adding two new nonterminal symbols, $S'$ and $y$, where $S'$ is the new start symbol, and productions $S' \to Sy$ and $y \to \lambda$. Call this the $\lambda$ extended Chomsky normal form.


Theorem 4.4 Given a context-free grammar $\Gamma$ whose language contains the empty word, there is a context-free grammar $\Gamma'$ in $\lambda$ extended Chomsky normal form so that $\Gamma(L) = \Gamma'(L')$.


We now consider converting a context-free language to Greibach normal form. Even though we use leftmost derivations, we have no bound on how many derivation steps may occur before the first terminal symbol appears at the left of the string. For example, using the production $A \to Aa$, we can generate the string $Aa^{n}$ for arbitrary $n$, using $n$ derivation steps without beginning a string with a terminal symbol. We can eliminate this particular problem by eliminating the productions of the form $A \to Aa$. This is called elimination of left recursion. In a grammar $\Gamma$ with no $\lambda$ productions or trivial productions, let

\[
A \to A V_1,\ A \to A V_2,\ \ldots,\ A \to A V_n
\]

be productions in which the right-hand side of the production begins with an $A$ and

\[
A \to U_1,\ A \to U_2,\ \ldots,\ A \to U_m
\]

be productions in which the right-hand side of the production does not begin with an $A$. We form a new grammar $\Gamma'$ by adding a new nonterminal $A'$ to the grammar and using the following steps:

(1) Eliminate all productions of the form $A \to A V_i$ for $1 \le i \le n$.
(2) Form productions $A \to U_i A'$ for $1 \le i \le m$.
(3) Form productions $A' \to V_i A'$ and $A' \to V_i$ for $1 \le i \le n$.
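The three steps can be sketched in Python. The representation is our own assumption: productions form a dictionary from a nonterminal to a list of right-hand sides, each a list of symbols, and the new nonterminal $A'$ is written by appending an apostrophe to the name. The function name is hypothetical. Since step (1) removes only the productions $A \to AV_i$, the productions $A \to U_i$ are kept.

```python
def eliminate_left_recursion(productions, A):
    """Apply steps (1)-(3) to the nonterminal A."""
    new_nt = A + "'"
    # the V_i: what follows A in each production A -> A V_i
    recursive = [rhs[1:] for rhs in productions[A] if rhs and rhs[0] == A]
    # the U_i: right-hand sides that do not begin with A
    others = [list(rhs) for rhs in productions[A] if not rhs or rhs[0] != A]
    result = {k: [list(r) for r in v] for k, v in productions.items()}
    if not recursive:
        return result                      # nothing to eliminate
    # step (1) drops A -> A V_i; step (2) adds A -> U_i A'
    result[A] = others + [u + [new_nt] for u in others]
    # step (3): A' -> V_i A' and A' -> V_i
    result[new_nt] = [v + [new_nt] for v in recursive] + recursive
    return result


g = eliminate_left_recursion({"A": [["A", "a"], ["b"]]}, "A")
```

For the grammar with productions $A \to Aa$ and $A \to b$, the result has $A \to b$, $A \to bA'$, $A' \to aA'$, and $A' \to a$, which still generates the words $ba^{n}$.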


Lemma 4.8 $\Gamma(L) = \Gamma'(L')$.


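Proof Using the productions $A \to A V_i$ for $1 \le i \le n$ and $A \to U_i$ for $1 \le i \le m$, let a derivation in $\Gamma$ beginning with $A$ have the form

\[
A \Rightarrow A V_{(1)} \Rightarrow A V_{(2)} V_{(1)} \Rightarrow^{*} A V_{(k)} \cdots V_{(2)} V_{(1)} \Rightarrow U_{(i)} V_{(k)} \cdots V_{(2)} V_{(1)}
\]

where $V_{(j)} \in \{V_1, V_2, \ldots, V_n\}$ for all $1 \le j \le k$ and $U_{(i)} \in \{U_1, U_2, \ldots, U_m\}$. Therefore using leftmost derivation, any derivation containing $A$ will have the form

\[
\begin{aligned}
wAW &\Rightarrow wAV_{(1)}W \Rightarrow wAV_{(2)}V_{(1)}W \Rightarrow^{*} wAV_{(k)} \cdots V_{(2)}V_{(1)}W \\
    &\Rightarrow wU_{(i)}V_{(k)} \cdots V_{(2)}V_{(1)}W.
\end{aligned}
\]

But

\[
A \Rightarrow AV_{(1)} \Rightarrow AV_{(2)}V_{(1)} \Rightarrow^{*} AV_{(k)} \cdots V_{(2)}V_{(1)} \Rightarrow U_{(i)}V_{(k)} \cdots V_{(2)}V_{(1)}
\]

can be replaced by

\[
\begin{aligned}
A &\Rightarrow U_{(i)}A' \Rightarrow U_{(i)}V_{(k)}A' \Rightarrow U_{(i)}V_{(k)}V_{(k-1)}A' \\
  &\Rightarrow^{*} U_{(i)}V_{(k)} \cdots V_{(2)}A' \Rightarrow U_{(i)}V_{(k)} \cdots V_{(2)}V_{(1)}.
\end{aligned}
\]

Placing $w$ on the left and $W$ on the right of each term, we have $wAW \Rightarrow^{*} wU_{(i)}V_{(k)} \cdots V_{(2)}V_{(1)}W$ in $\Gamma'$. Hence $\Gamma(L) \subseteq \Gamma'(L')$.
The proof that $\Gamma'(L') \subseteq \Gamma(L)$ is left to the reader.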














Lemma 4.9 Let $A \to UBV$ be a production in $\Gamma$ and $B \to W_1, B \to W_2, \ldots, B \to W_m$ be the set of all productions in $\Gamma$ with $B$ on the left. Let $\Gamma'$ be the grammar with production $A \to UBV$ removed and the productions $A \to UW_iV$ for $1 \le i \le m$ added; then $\Gamma(L) = \Gamma'(L')$.


Proof The production $A \to UW_iV$ can always be replaced by the production $A \to UBV$ followed by the production $B \to W_i$. Hence $\Gamma'(L') \subseteq \Gamma(L)$. The proof that $\Gamma(L) \subseteq \Gamma'(L')$ is left to the reader.
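The substitution of Lemma 4.9 is easy to carry out by machine. The sketch below is ours and rests on assumptions not made in the text: productions are a dictionary from a nonterminal to a list of right-hand sides (lists of symbols), the occurrence of $B$ to be eliminated is given by its position in the right-hand side, and $A \ne B$. The name `substitute` is hypothetical.

```python
def substitute(productions, A, rhs, position):
    """Remove A -> U B V and add A -> U W_i V for every production B -> W_i.

    rhs is the right-hand side U B V as a list; position is the index of B.
    """
    B = rhs[position]
    U, V = list(rhs[:position]), list(rhs[position + 1:])
    result = {k: [list(r) for r in v] for k, v in productions.items()}
    result[A].remove(list(rhs))                       # drop A -> U B V
    result[A].extend(U + list(W) + V for W in productions[B])
    return result


g = substitute(
    {"A": [["a", "B", "c"]], "B": [["b"], ["B", "b"]]},
    "A", ["a", "B", "c"], 1,
)
```

Here $A \to aBc$ is replaced by $A \to abc$ and $A \to aBbc$, and the productions for $B$ are untouched.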
The theorem that every context-free grammar can be expressed in Greibach
normal form can be proved by first expressing the grammar in Chomsky normal 
form. We shall not do so however so that the development for Chomsky normal 
form may be omitted if desired. Using the above lemmas, we are about to take 
a giant leap toward proving that every context-free grammar can be expressed 
in Greibach normal form. 


Lemma 4.10 Any context-free grammar which does not generate $\lambda$ can be expressed so that each of its productions is of the form

\[
A \to aW,
\]

where $a$ is a terminal and $W$ is a string which is empty or consists of a string of terminals and/or nonterminals.


Proof We first order the nonterminals beginning with $S$, the start symbol. For simplicity, let the nonterminals be $A_1, A_2, A_3, \ldots, A_m$. Our first goal is to change every production so that it is either in the form

\[
A \to aW,
\]

where $a$ is a terminal and $W$ is a string which is empty or consists of a string of terminals and/or nonterminals, or in the form

\[
A_i \to A_j Y,
\]

where $i < j$ and $Y$ consists of a string of terminals and/or nonterminals. Recall that using the procedures for elimination of left recursion and for eliminating a nonterminal described in Lemma 4.9 to alter the productions of the grammar does not change the language generated by the grammar.

Using induction, for $i = 1$, since $S = A_1$ is automatically less than every other nonterminal, we need only consider $S \to SY$; the $S$ on the right-hand side of the production can be removed by the process of elimination of left recursion. Assume it is true for every $A_i$ where $i < k$. We now prove the statement for $i = k$. In each case where $A_k \to A_j Y$ is a production for $k > j$, use the procedure in Lemma 4.9 to eliminate $A_j$. When $A_k \to A_k Y$ is a production, use the process of elimination of left recursion to remove $A_k$ from the right-hand side.

Therefore by induction we have every production so that it is either in the form

\[
A \to aW,
\]

where $a$ is a terminal and $W$ is a string which is empty or consists of a string of terminals and/or nonterminals, or in the form

\[
A_i \to A_j Y,
\]

where $i < j$ and $Y$ consists of a string of terminals and/or nonterminals.

Any production with $A_m$ on the left-hand side must have the form $A_m \to aW$ since there is no nonterminal larger than $A_m$. If there is a production of the form $A_{m-1} \to A_m W'$, use the procedures in Lemma 4.9 to eliminate $A_m$. The result is a production of the form $A_{m-1} \to bW''$. Assume $k$ is the largest value so that $A_k \to A_j Y$ is a production where $k < j$. Again using the procedures in Lemma 4.9 to eliminate $A_j$, we have a production of the form $A_k \to aW$. When the process is completed, we have

\[
A_i \to aW
\]

where $a$ is a terminal and $W$ is a string which is empty or consists of a string of terminals and/or nonterminals for every $i$. We now have to consider the $B_i$ created using the process of elimination of left recursion. From the construction of the $B_i$, it is impossible to have a production of the form $B_i \to B_j W$. Therefore productions with $B_i$ on the left have the form $B_i \to aW$ or $B_i \to A_j W$. Repeating the process above we can change these to the form $B_i \to aW$, and the lemma is proved.














Theorem 4.5 Every context-free grammar whose language does not contain $\lambda$ can be expressed in Greibach normal form.


Proof We outline the proof. The details are left to the reader. Since we already know that every production can be written in the form

\[
A \to aW,
\]

where $a$ is a terminal and $W$ is a string which is empty or consists of a string of terminals and/or nonterminals, for every terminal $b$ in $W$ replace it with a nonterminal $A_b$ and add the production $A_b \to b$. Hint: see proof of Lemma 4.6.














For any context-free grammar whose language contains the empty word we can form a grammar in extended Greibach normal form, which accepts the empty word, by simply adding the production $S \to \lambda$ after the completion of the Greibach normal form.
Exercises


(1) In Lemma 4.9 “Let $A \to UBV$ be a production in $\Gamma$ and $B \to W_1, B \to W_2, \ldots, B \to W_m$ be the set of all productions in $\Gamma$ with $B$ on the left. Let $\Gamma'$ be the grammar with production $A \to UBV$ removed and the productions $A \to UW_iV$ for $1 \le i \le m$ added; then $\Gamma(L) = \Gamma'(L')$,” prove $\Gamma(L) \subseteq \Gamma'(L')$.

(2) Prove Theorem 4.5 “Every context-free grammar can be expressed in 
Greibach normal form.” 

(3) Complete the proof of Lemma 4.8. 

(4) Let T” = (N, Ł, S, P) be the grammar defined by N = {S, A, B}, © = 
{a, b}, and P be the set of productions 


S— ABABABA A > Aa A>d B >b. 


Express this grammar in Chomsky normal form. 

(5) Express the previous grammar in Greibach normal form. 

(6) Let $\Gamma = (N, T, S, P)$ be the grammar with $N = \{S, A, B\}$, $T = \{a, b\}$, and $P$ contain the productions

    S → SS    B → aa    S → BS    B → bb    S → SB
    A → ab    S → λ     A → ba    S → ASA.


Express this grammar in Chomsky normal form. 

(7) Express the previous grammar in Greibach normal form. 

(8) Let $\Gamma = (N, \Sigma, S, P)$ be the grammar defined by $N = \{S, A, B\}$, $\Sigma = \{a, b\}$, and $P$ be the set of productions

    S → AbaB    A → bAa    A → λ    B → AAb    B → aabA.


Express this grammar in Chomsky normal form. 
(9) Express the previous grammar in Greibach normal form. 


4.3 Pushdown automata and context-free languages 


The primary importance of the PDA is that a language is accepted by a PDA if and only if it is generated by a context-free grammar. Recall that a context-free language is a language that is generated by a context-free grammar. In the remainder of this section, we show that a language is context-free if and only if it is accepted by a PDA.

We first demonstrate how to construct a PDA that will read the language 
generated by a context-free grammar. 




Before beginning we need two tools. The first is the concept of pushing a string of stack symbols into the stack. We are not changing the definition of the stack. To push a string into the stack we simply mean that we are pushing the last symbol of the string into the stack, then the next to last symbol into the stack, and continuing until the first symbol of the string has been pushed into the stack. If the string is then popped a symbol at a time, the symbols form the original string. For example to push $abAc$, we first push $c$, then push $A$, then push $b$, and finally push $a$. Thus we may consider a PDA to have the form $M = (\Sigma, Q, s, I, \Upsilon, F)$ where $\Sigma$ is the alphabet, $Q$ is the set of states, $s$ is the initial or starting state, $I$ is the set of stack symbols, $F$ is the set of acceptance states, and $\Upsilon$ is the transition relation. The relation $\Upsilon$ is a finite subset of $(Q \times \Sigma^{*} \times I^{*}) \times (Q \times I^{*})$. This means that the machine in some state $q \in Q$ can read a possibly empty string of the alphabet, reading a letter at a time if the string is nonempty, pop and read a possibly empty string of stack symbols, popping and reading a symbol at a time if the string is nonempty, and as a result change state and push a string of symbols onto the stack as described above.
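The convention for pushing a string can be checked with a few lines of Python. This is only a sketch: we assume the stack is a Python list whose last element is the top of the stack, and the name `push_string` is ours.

```python
def push_string(stack, symbols):
    """Push a string so that its first symbol ends up on top of the stack."""
    for symbol in reversed(symbols):   # last symbol is pushed first
        stack.append(symbol)


stack = []
push_string(stack, "abAc")             # pushes c, then A, then b, then a
popped = "".join(stack.pop() for _ in range(4))
```

Popping the four symbols one at a time gives back the original string, as the text states.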

Throughout the remainder of this section we shall assume that only leftmost derivations for context-free languages are used and that the PDA has only two states, $s$ and $t$. The alphabet $\Sigma$ in the PDA consists of the terminal symbols of the grammar $\Gamma$. The stack symbols of the PDA consist of the terminal and nonterminal symbols of the grammar $\Gamma$, i.e. $I = T \cup N$.

To convert a context-free grammar I’ into a PDA, which accepts the same 
language generated by I’, we use the following rules: 


(1) Begin by pushing $S$, the start symbol of the grammar, i.e. begin with the automaton

[Figure: start, followed by push S]

















(2) If a nonterminal $A$ is popped from the stack, then for some production $A \to w$ in $\Gamma$, $w$ is pushed into the stack.

[Figure: automaton fragment for rule (2)]

(3) If a terminal $a$ is popped from the stack then $a$ must be read.

[Figure: automaton fragment for rule (3)]

Thus the terminal elements at the top of the stack are removed and matched with the letters from the tape.


The result is that we are imitating the productions in the grammar by popping 
the first part of the production and pushing the second part so that while the 
grammar replaces the first part of the production with the second part, so does 
the PDA. The stack then resembles the strings derived in the grammar except 
that the terminals on the left of the derived string (top of the stack) are then 
removed as they occur in the stack and compared with the letters on the tape. 
As before a word is accepted if the word has been read and the stack is empty. 
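Rules (1) to (3) can be simulated directly. The following Python sketch is ours and makes several assumptions beyond the text: every symbol is a single character, productions are a dictionary from a nonterminal to a list of right-hand-side strings (the empty string standing for $\lambda$), the nondeterminism of rule (2) is handled by a breadth-first search over pairs (position on the tape, stack), and stacks longer than `max_stack` are discarded so that the search stops. This cut-off is a heuristic of the sketch; for grammars that need longer stacks it must be raised.

```python
from collections import deque


def accepts(productions, start, word, max_stack=None):
    """Search the computations of the PDA built by rules (1)-(3)."""
    if max_stack is None:
        max_stack = 2 * len(word) + 2      # heuristic bound, see lead-in
    nonterminals = set(productions)
    first = (0, (start,))                  # rule (1): push the start symbol
    seen = {first}
    queue = deque([first])
    while queue:
        pos, stack = queue.popleft()
        if not stack:
            if pos == len(word):
                return True                # word read and stack empty
            continue
        top, rest = stack[0], stack[1:]
        if top in nonterminals:
            # rule (2): pop A and push w for some production A -> w
            successors = [(pos, tuple(rhs) + rest) for rhs in productions[top]]
        elif pos < len(word) and word[pos] == top:
            # rule (3): pop the terminal a and read a from the tape
            successors = [(pos + 1, rest)]
        else:
            successors = []
        for conf in successors:
            if len(conf[1]) <= max_stack and conf not in seen:
                seen.add(conf)
                queue.append(conf)
    return False


palindromes = {"S": ["aSa", "bSb", ""]}
```

With this grammar, `accepts(palindromes, "S", "abba")` follows the same pops, pushes, and reads as a hand trace of the word abba.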


Example 4.17 Let $\Gamma = (N, T, S, P)$ be the grammar with $N = \{S\}$, $T = \{a, b\}$, and $P$ contain the productions

    S → aSa    S → bSb    S → λ

which generates the language $\{ww^{R} : w \in T^{*}\}$, where $w^{R}$ is the word $w$ written in reverse. This has the PDA

[Figure: PDA for Example 4.17, with nodes for pushing S, aSa, and bSb, reading a and b, and accept]

Consider the word $abba$. We can trace its path with the following table:






































    state   instruction   stack   tape
    s       start         λ       abba
    t       push S        S       abba
    t       pop S         λ       abba
    t       push aSa      aSa     abba
    t       pop a         Sa      abba
    t       read a        Sa      bba
    t       pop S         a       bba
    t       push bSb      bSba    bba
    t       pop b         Sba     bba
    t       read b        Sba     ba
    t       pop S         ba      ba
    t       pop b         a       ba
    t       read b        a       a
    t       pop a         λ       a
    t       read a        λ       λ
    t       accept        λ       λ








Example 4.18 Let $\Gamma = (N, T, S, P)$ be the grammar with $N = \{S, A, B\}$, $T = \{a, b\}$, and $P$ contain the productions

    S → SS    B → aa    S → BS    B → bb    S → SB
    A → ab    S → λ     A → ba    S → ASA

which generates the language $\{w : w \in T^{*}$ and $w$ contains an even number of $a$s and an even number of $b$s$\}$. This has the PDA

[Figure: PDA for Example 4.18, with push nodes for the right-hand sides of the productions, read a and read b nodes, and accept]






Consider the word abbabb. We can trace its path with the following table. In 
this table to save space, strings will be pushed in as one operation rather than 
pushing in each symbol. 











    instruction   stack    tape
    start         λ        abbabb
    push S        S        abbabb
    pop           λ        abbabb
    push SS       SS       abbabb
    pop           S        abbabb
    push ASA      ASAS     abbabb
    pop           SAS      abbabb
    push ab       abSAS    abbabb
    pop           bSAS     abbabb
    read a        bSAS     bbabb
    pop           SAS      bbabb
    read b        SAS      babb
    pop           AS       babb
    pop           S        babb
    push ba       baS      babb
    pop           aS       babb
    read a        S        bb
    pop           λ        bb
    push bb       bb       bb
    pop           b        bb
    read b        b        b
    pop           λ        b
    read b        λ        λ








Before formally proving that a language $\Gamma(L)$ is context-free if and only if it is accepted by a PDA, we shall adopt a notation for PDAs which will be more convenient. We shall denote by an ordered triple the current condition of the PDA. This triple consists of the current state of the machine, the remaining string of input symbols to be read, and the current string in the stack. For example the triple $(s, aabb, AaBaB)$ represents the PDA in state $s$, with $aabb$ on the input tape, and $AaBaB$ in the stack. Given triples $(s, u, V)$ and $(t, v, W)$, the notation $(s, u, V) \vdash (t, v, W)$ indicates that the PDA can be changed from $(s, u, V)$ to $(t, v, W)$ in a single transition from $\Upsilon$. For example the transition $((s, a, B), (t, \lambda))$ when the PDA is in condition $(s, ab, BaaB)$ gives $(s, ab, BaaB) \vdash (t, b, aaB)$, and the transition $((s, a, \lambda), (t, D))$ when the PDA is in condition $(s, ab, BaaB)$ gives $(s, ab, BaaB) \vdash (t, b, DBaaB)$. We say that $(s, u, V) \vdash^{*} (t, v, W)$ if the PDA can be changed from $(s, u, V)$ to $(t, v, W)$ in a finite number of transitions.


Lemma 4.11 Let $\Gamma(L)$ be a context-free language. There exists a PDA that accepts $L$,

\[
M = (\Sigma, Q, s, I, \Upsilon, F),
\]

where $\Sigma$ is a finite alphabet, $Q$ is a finite set of states, $s$ is the initial or starting state, $I$ is a finite set of stack symbols, $\Upsilon$ is the transition relation, and $F$ is the set of acceptance states. The relation $\Upsilon$ is a subset of

\[
(Q \times \Sigma^{*} \times I^{*}) \times (Q \times I^{*}).
\]


Proof As previously mentioned, we shall assume that the PDA has two states which we shall denote here as $s$ and $t$ so that $M = (\Sigma, Q, s, I, \Upsilon, F)$ where $\Sigma$, the alphabet, consists of the terminal symbols $T$ of the grammar $\Gamma = (N, T, S, P)$, $Q = \{s, t\}$, $s$ is the initial or starting state, $I$, the set of stack symbols, consists of the terminal and nonterminal symbols of the grammar, i.e. $I = T \cup N$, $F = \{t\}$, and $\Upsilon$ is the transition relation defined as follows:

(1) $((s, \lambda, \lambda), (t, S)) \in \Upsilon$ so $(s, u, \lambda) \vdash (t, u, S)$ for $u \in T^{*}$. (Begin by pushing $S$, the start symbol of the grammar.)

[Figure: start, followed by push S]

(2) If $A \to w$ in $\Gamma$, then $((t, \lambda, A), (t, w)) \in \Upsilon$ so $(t, u, Av) \vdash (t, u, wv)$ for $u \in T^{*}$. (If a nonterminal $A$ is popped from the stack, then for some production $A \to w$ in $\Gamma$, $w$ is pushed into the stack.)

[Figure: automaton fragment for rule (2)]

(3) For all $a \in T$, $((t, a, a), (t, \lambda)) \in \Upsilon$ so $(t, au, aw) \vdash (t, u, w)$ for $u \in T^{*}$. (If a terminal $a$ is popped from the stack then $a$ must be read.)

[Figure: automaton fragment for rule (3)]

We shall assume that $\Gamma$ is in Chomsky normal form. The reader is asked to prove the theorem when $\Gamma$ is in Greibach normal form.

Using leftmost derivation, every string that is derived either begins with a terminal or contains no terminal. In the corresponding PDA, the derived string is placed in the stack using (2). If there is a terminal, it is compared with the next letter to be read as input. If they agree, then the terminal is removed from the stack using (3). If the word on the tape is the same as the word generated by the grammar, each terminal will be removed from the stack as it is generated, leaving only an empty stack after the tape has been read.

We first show, assuming leftmost derivation in $\Gamma$, that if $S \Rightarrow^{*} \alpha\beta$ where $\alpha \in T^{*}$ and $\beta$ begins with a nonterminal or is empty, then $(t, \alpha, S) \vdash^{*} (t, \lambda, \beta)$. Hence if $S \Rightarrow^{*} \alpha$ in $\Gamma$ where $\alpha \in T^{*}$, then $(s, \alpha, \lambda) \vdash (t, \alpha, S) \vdash^{*} (t, \lambda, \lambda)$, and $\alpha$ is accepted by the PDA. We prove this using induction on the length of the derivation. Suppose $n = 0$; then we have $S \Rightarrow^{*} S$, so $\alpha = \lambda$, $\beta = S$, and $(t, \lambda, S) \vdash^{*} (t, \lambda, S)$ gives us $(t, \alpha, S) \vdash^{*} (t, \lambda, \beta)$. Now assume $S \Rightarrow^{*} \gamma$ in $k + 1$ steps. Say

\[
S \Rightarrow m_1 \Rightarrow m_2 \Rightarrow^{*} m_k \Rightarrow m_{k+1}.
\]

Then there is a first nonterminal $B$ in the string $m_k$ and a production $B \to w$ so $m_k = uBv$ and $m_{k+1} = uwv$. By the induction hypothesis, since $S \Rightarrow^{*} uBv$, $(t, u, S) \vdash^{*} (t, \lambda, Bv)$. Since $B \to w$, using relation (2), we have $((t, \lambda, B), (t, w)) \in \Upsilon$ and $(t, \lambda, Bv) \vdash (t, \lambda, wv)$. If the production $B \to w$ has the form $B \to CD$, where $C$ and $D$ are nonterminals, so that $w = CD$, then $w$ begins with a nonterminal and $(t, u, S) \vdash^{*} (t, \lambda, Bv) \vdash (t, \lambda, wv)$, or $(t, u, S) \vdash^{*} (t, \lambda, wv)$ where $m_{k+1} = uwv$ as desired. If the production has the form $B \to a$ so that $w = a$, where $a$ is a terminal, then $(t, ua, S) \vdash^{*} (t, a, Bv) \vdash (t, a, av) \vdash (t, \lambda, v)$ using relation (3) above. Hence $(t, uw, S) \vdash^{*} (t, \lambda, v)$ where $m_{k+1} = uwv$ as desired. Note that since we are using leftmost derivation, $v$ must begin with a nonterminal or be empty.

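We now show that if $(t, \alpha, S) \vdash^{*} (t, \lambda, \beta)$ with $\alpha \in T^{*}$, $\beta \in I^{*}$, then $S \Rightarrow^{*} \alpha\beta$. Hence if $(s, \alpha, \lambda) \vdash (t, \alpha, S) \vdash^{*} (t, \lambda, \lambda)$, so that $\alpha$ is accepted by the PDA, then $S \Rightarrow^{*} \alpha$, so $\alpha$ is generated by the grammar.

We again use induction on the length of the computation by the PDA. If $k = 0$, then $(t, \lambda, S) \vdash^{*} (t, \lambda, S)$ so $S \Rightarrow^{*} S$, which is certainly true. Assume $(t, \alpha, S) \vdash^{*} (t, \lambda, \beta)$ in $k + 1$ steps so that $(t, \alpha, S) \vdash^{*} (t, x, \gamma)$ in $k$ steps and $(t, x, \gamma) \vdash (t, \lambda, \beta)$. If the transition relation for $(t, x, \gamma) \vdash (t, \lambda, \beta)$ is relation (2), then $\gamma = Bv$, $\beta = wv$, and $B \to w$. Since no input is read, we have $x = \lambda$. Therefore by induction, $S \Rightarrow^{*} \alpha\gamma = \alpha Bv$. Since $B \to w$, $\alpha Bv \Rightarrow \alpha wv = \alpha\beta$. Therefore $S \Rightarrow^{*} \alpha\beta$. If the transition relation for $(t, x, \gamma) \vdash (t, \lambda, \beta)$ is type (3), then $x = a$ and $\gamma = a\beta$ for some terminal $a$. But since $a$ is the last input read from the string $\alpha$, $\alpha = ua$ for $u \in T^{*}$. Hence $(t, u, S) \vdash^{*} (t, \lambda, \gamma)$ and by induction $S \Rightarrow^{*} u\gamma = ua\beta = \alpha\beta$.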














For a given pushdown automaton $M = (\Sigma, Q, s, I, \Upsilon, F)$, we next wish to construct a context-free grammar $\Gamma = (N, \Sigma, S, P)$. The nonterminals in $N$ shall be of the form $(p, B, q)$, where $p$ and $q$ are states of the automaton and $B$ is a stack symbol. Thus $(p, B, q)$ represents the input $u$ read in passing from state $p$ to state $q$, where $B$ is removed from the stack. In fact we shall have $(p, B, q) \Rightarrow^{*} u$. The nonterminal $(p, \lambda, q)$ represents the input read in passing from state $p$ to state $q$ and leaving the stack as it was in state $p$. The productions consist of the following four types.

(1) For each $q \in F$, the production $S \to (s, \lambda, q)$.

(2) For each transition $((p, a, B), (q, D)) \in \Upsilon$, where $B, D \in I \cup \{\lambda\}$, the productions $(p, B, t) \to a(q, D, t)$ for all $t \in Q$.

(3) For each transition $((p, a, D), (q, B_1 B_2 \cdots B_n)) \in \Upsilon$, where $D \in I \cup \{\lambda\}$ and $B_1, B_2, \ldots, B_n \in I$, the productions

\[
(p, D, t) \to a(q, B_1, q_1)(q_1, B_2, q_2)(q_2, B_3, q_3) \cdots (q_{n-1}, B_n, t)
\]

for all $q_1, q_2, \ldots, q_{n-1}, t \in Q$.

(4) For each $q \in Q$, the production $(q, \lambda, q) \to \lambda$.


The first statement intuitively says that at the beginning we need to generate the entire word accepted by the PDA. The second statement intuitively says that the output generated by $(p, B, t)$, which is the input to be read by the PDA in state $p$ using stack $B$ moving to state $t$, is equal on the right-hand side of the production to the input read in passing from state $p$ with stack $B$ to state $q$ with stack $D$ followed by the output generated by $(q, D, t)$, which is the input read by the PDA in state $q$ with stack $D$ moving to state $t$.

The third statement intuitively says that the output generated by $(p, D, t)$, which is the input to be read by the PDA in state $p$ with stack $D$ moving to state $t$, is equal on the right-hand side of the production to the input read in passing from state $p$ with stack $D$ to state $q$ with stack $B_1 B_2 \cdots B_n$ followed by the output generated by $(q, B_1, q_1)$, the input read by passing from state $q$ using stack $B_1$ to state $q_1$, followed by the output generated by $(q_1, B_2, q_2)$, the input read by passing from state $q_1$ using stack $B_2$ to state $q_2$, and so on, followed by the output generated by $(q_{n-1}, B_n, t)$, the input read by passing from state $q_{n-1}$ using stack $B_n$ to state $t$.

The fourth statement intuitively says that to move from a state to itself 
requires no input. Note that the productions of type (4) are the only ones 
which pop nonterminals without replacing them with other nonterminals. Hence 
a word in the language of the grammar cannot be generated without these 
productions. 
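The four types of productions can be listed mechanically for a small automaton. The Python sketch below is our own illustration and assumes a representation not used in the text: a transition is a pair `((p, a, B), (q, pushed))` in which `a` and `B` are a single symbol or the empty string (standing for $\lambda$) and `pushed` is a string of stack symbols; a nonterminal $(p, B, q)$ is a Python tuple; a production is a pair (left-hand side, list of right-hand-side symbols). The function name is hypothetical.

```python
from itertools import product

LAMBDA = ""   # stands for the empty word and the empty stack string


def pda_to_productions(states, start, finals, transitions):
    """List the productions of types (1)-(4) for the given PDA."""
    prods = []
    for q in finals:                                   # type (1)
        prods.append(("S", [(start, LAMBDA, q)]))
    for (p, a, B), (q, pushed) in transitions:
        if len(pushed) <= 1:                           # type (2)
            for t in states:
                prods.append(((p, B, t), [a, (q, pushed, t)]))
        else:                                          # type (3)
            n = len(pushed)
            for mids in product(states, repeat=n - 1): # choices of q_1..q_{n-1}
                chain = (q,) + mids
                for t in states:
                    ends = mids + (t,)
                    rhs = [a] + [(chain[i], pushed[i], ends[i]) for i in range(n)]
                    prods.append(((p, B, t), rhs))
    for q in states:                                   # type (4)
        prods.append(((q, LAMBDA, q), []))
    return prods


# A two-state PDA that pushes an A for every a, then pops an A for every b.
anbn = [
    (("s", "a", LAMBDA), ("s", "A")),
    (("s", LAMBDA, LAMBDA), ("t", LAMBDA)),
    (("t", "b", "A"), ("t", LAMBDA)),
]
prods = pda_to_productions(["s", "t"], "s", ["t"], anbn)
```

A transition that pushes $n$ symbols contributes one production for each choice of $q_1, \ldots, q_{n-1}, t$, so the number of productions grows as $|Q|^{n}$; many of the resulting nonterminals generate no word at all, which does no harm to the language.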


Lemma 4.12 A language $M(L)$ accepted by a pushdown automaton $M = (\Sigma, Q, s, I, \Upsilon, F)$ is a context-free language.


Proof Using the grammar $\Gamma = (N, \Sigma, S, P)$, where the nonterminals and productions are described above, we show that $\Gamma$ generates the same language as accepted by $M$.

We first show that for $p, q \in Q$, $B \in I \cup \{\lambda\}$ and $w \in \Sigma^{*}$,

\[
(p, B, q) \Rightarrow^{*} w \text{ if and only if } (p, w, B) \vdash^{*} (q, \lambda, \lambda).
\]

Thus for $t \in Q$, $(s, \lambda, t) \Rightarrow^{*} w$ if and only if $(s, w, \lambda) \vdash^{*} (t, \lambda, \lambda)$, so that a word is generated by $\Gamma$ if and only if it is accepted by $M$.

First, using induction on the number of derivation steps, we show that if $(p, B, q) \Rightarrow^{*} w$ then $(p, w, B) \vdash^{*} (q, \lambda, \lambda)$. Beginning with $n = 1$, the only possibility is that a nonterminal is removed without replacement. This can only occur using productions of type (4), so we have $p = q$, $B = \lambda$, and $w = \lambda$. But this gives us $(p, \lambda, \lambda) \vdash^{*} (p, \lambda, \lambda)$ which is obvious. Assume $n = k > 1$; then the first production can only be of type (2) or type (3). If it is type (2), we have $(p, B, q) \to a(r, D, q)$ for $p, r \in Q$, where $((p, a, B), (r, D)) \in \Upsilon$. Hence letting $w = av$, $(p, w, B) \vdash (r, v, D)$, and by induction, since $(r, D, q) \Rightarrow^{*} v$, we have $(r, v, D) \vdash^{*} (q, \lambda, \lambda)$. Therefore $(p, w, B) \vdash^{*} (q, \lambda, \lambda)$.

If the first production is of type (3), we have

\[
(p, B, q) \Rightarrow a(q_0, B_1, q_1)(q_1, B_2, q_2)(q_2, B_3, q_3) \cdots (q_{n-1}, B_n, q) \Rightarrow^{*} w
\]

and $((p, a, B), (q_0, B_1 B_2 \cdots B_n)) \in \Upsilon$. So if $w = av$, $(p, w, B) \vdash (q_0, v, B_1 B_2 \cdots B_n)$. For convenience of notation, let $q = q_n$. Let $(q_{i-1}, B_i, q_i) \Rightarrow^{*} u_i$ so that $w = a u_1 u_2 \cdots u_n$ and $v = u_1 u_2 \cdots u_n$. By induction, $(q_{i-1}, u_i, B_i) \vdash^{*} (q_i, \lambda, \lambda)$.

Therefore we have

\[
\begin{aligned}
(p, w, B) &\vdash (q_0, u_1 u_2 \cdots u_n, B_1 B_2 \cdots B_n) \\
          &\vdash^{*} (q_1, u_2 \cdots u_n, B_2 \cdots B_n) \\
          &\vdash^{*} (q_2, u_3 \cdots u_n, B_3 \cdots B_n) \\
          &\;\;\vdots \\
          &\vdash^{*} (q_{n-1}, u_n, B_n) \\
          &\vdash^{*} (q_n, \lambda, \lambda)
\end{aligned}
\]

so that $(p, w, B) \vdash^{*} (q, \lambda, \lambda)$.

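We now show that if $(p, w, B) \vdash^{*} (q, \lambda, \lambda)$ then $(p, B, q) \Rightarrow^{*} w$. We use induction on the number of steps in $(p, w, B) \vdash^{*} (q, \lambda, \lambda)$. If there are 0 steps, then $p = q$ and $w = B = \lambda$. This corresponds to $(p, \lambda, p) \Rightarrow \lambda$ which is one of the productions. Therefore the statement is true for 0 steps.

Assume $(p, w, B) \vdash^{*} (q, \lambda, \lambda)$ in $k + 1$ steps. First assume that we have $w = av$ and

\[
(p, w, B) \vdash (r, v, D) \vdash^{*} (q, \lambda, \lambda)
\]

where $((p, a, B), (r, D)) \in \Upsilon$, and $B, D \in I \cup \{\lambda\}$, giving productions $(p, B, q) \to a(r, D, q)$. Since $(r, v, D) \vdash^{*} (q, \lambda, \lambda)$, by the induction hypothesis $(r, D, q) \Rightarrow^{*} v$. Therefore $(p, B, q) \Rightarrow a(r, D, q) \Rightarrow^{*} av = w$ and we are finished.

Next assume $w = av$ and the first step is $(p, w, B) \vdash (q_0, v, B_1 B_2 \cdots B_n)$ so we have

\[
(p, w, B) \vdash (q_0, v, B_1 B_2 \cdots B_n) \vdash^{*} (q, \lambda, \lambda)
\]

and each $B_i$ is eventually removed from the stack in order so that there are states $q_1, q_2, \ldots, q_{n-1}, q_n$ where $q_n = q$ and $v = v_1 v_2 \cdots v_{n-1} v_n$ such that

\[
\begin{aligned}
(p, w, B) &\vdash (q_0, v_1 v_2 \cdots v_{n-1} v_n, B_1 B_2 \cdots B_n) \\
          &\vdash^{*} (q_1, v_2 \cdots v_{n-1} v_n, B_2 \cdots B_n) \\
          &\vdash^{*} (q_2, v_3 \cdots v_{n-1} v_n, B_3 \cdots B_n) \\
          &\;\;\vdots \\
          &\vdash^{*} (q_{n-1}, v_n, B_n) \\
          &\vdash^{*} (q_n, \lambda, \lambda).
\end{aligned}
\]

By the induction hypothesis, $(q_{i-1}, B_i, q_i) \Rightarrow^{*} v_i$.
But since the production is type (3),

\[
\begin{aligned}
(p, B, q) &\Rightarrow a(q_0, B_1, q_1)(q_1, B_2, q_2)(q_2, B_3, q_3) \cdots (q_{n-1}, B_n, q) \\
          &\Rightarrow^{*} a v_1 (q_1, B_2, q_2)(q_2, B_3, q_3) \cdots (q_{n-1}, B_n, q) \\
          &\Rightarrow^{*} a v_1 v_2 (q_2, B_3, q_3) \cdots (q_{n-1}, B_n, q) \\
          &\;\;\vdots \\
          &\Rightarrow^{*} a v_1 v_2 \cdots v_{n-1} (q_{n-1}, B_n, q) \\
          &\Rightarrow^{*} a v_1 v_2 \cdots v_{n-1} v_n
\end{aligned}
\]

so that $(p, B, q) \Rightarrow^{*} w$.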











Theorem 4.6 A language is context-free if and only if it is accepted by a PDA. 


Exercises 


(1) Construct a pushdown automaton which reads the same language as generated by the grammar $\Gamma = (N, \Sigma, S, P)$ defined by $N = \{S, A, B\}$, $\Sigma = \{a, b, c\}$, and the set of productions $P$ given by

    S → aA    A → aAB    A → a    B → b    B → λ.


(2) Construct a pushdown automaton which reads the same language as generated by the grammar $\Gamma = (N, \Sigma, S, P)$ defined by $N = \{S, A, B\}$, $\Sigma = \{a, b, c\}$, and the set of productions $P$ given by

    S → AB    A → abaA    A → λ    B → Bcacc    B → λ.


(3) Construct a pushdown automaton which reads the same language as generated by the grammar $\Gamma = (N, \Sigma, S, P)$ defined by $N = \{S, A, B\}$, $\Sigma = \{a, b, c\}$, and the set of productions $P$ given by

    S → AcB    A → abaA    A → λ    B → Bcacb    B → λ.


(4) Construct a pushdown automaton which reads the same language as 
generated by the grammar I’ = (N, &, S, P) defined by N = {S, A, B}, 
x = {a, b, c}, and the set of productions P given by 


S— AB A — acA B — bcB B —> bB 
A —> aAa Boi A->d. 




(5) Construct a pushdown automaton which reads the same language as 
generated by the grammar I’ = (N, &, S, P) defined by N = {S, A, B}, 
x = {a, b, c, d}, and the set of productions P given by 


S —> AB A—> aAc B —> bBc B —> bB 
A —> AaA Boi A>. 


(6) Construct a grammar which generates the language read by the pushdown automaton

[Figure: pushdown automaton for Exercise 6]


(7) Construct a grammar which generates the language read by the pushdown 
automaton 
(8) Construct a grammar which generates the language read by the pushdown automaton

[Figure: pushdown automaton for Exercise 8]


(9) Construct a grammar which generates the language read by the pushdown automaton

[Figure: pushdown automaton for Exercise 9]





(10) Prove Theorem 4.6 “A language is context-free if and only if it is accepted 
by a PDA.” Assume the grammar is in Greibach normal form. 

(11) Construct a pushdown automaton that reads the same language as the 
grammar I = (N, £, S, P) defined by N = {S, B} UX, È = {a, b, c}, 
and the set of productions P given by 


S—> aA A — aAB A->a B —b B> ìà. 
(12) Construct a pushdown automaton that reads the same language as the grammar $\Gamma = (N, \Sigma, S, P)$ defined by $N = \{S, A, B\}$, $\Sigma = \{a, b, c\}$, and
the set of productions P given by 


S— AB A —> abaA A> B —> Bcacc Bo. 


(13) Construct a pushdown automaton that reads the same language as the 
grammarT = (N, T, S, P) defined by N = {S, A, B}, & = {a, b, c}, and 
the set of productions P given by 


S —> AdB A —> abaA A—>ì B —> Bcacb Bi. 


(14) Construct a pushdown automaton that reads the same language as the
grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and
the set of productions P given by


S → AB     A → acA    B → bcB    B → bB
A → aAa    B → λ      A → d.


(15) Construct a pushdown automaton that reads the same language as the
grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c, d},
and the set of productions P given by


S → AB     A → aAc    B → bBc    B → bB
A → AaA    B → λ      A → d.


4.4 The Pumping Lemma and decidability 


Just as we were able to show that there are languages that are not regular 
languages, we are also able to show that there are languages that are not context- 
free. We begin by returning to the concept of the parse tree or derivation tree. 
The height of the tree is the length of the longest path in the tree. The level of 
a vertex A in the tree is the length of the path from the vertex S to the vertex A. 
A tree is a binary tree if each vertex has at most two children. Note that if the 
grammar is in Chomsky normal form then every tree formed is a binary tree. 


Lemma 4.13 If A ⇒* w where A is a nonterminal and the height of the
corresponding derivation tree with root A is n, then the length of w is less than
or equal to 2^(n-1).


Proof We use induction on the height of the derivation tree. If n = 1, then the
derivation has the form A → a, and the length of w = a is 1 = 2^0. Assume the
lemma is true for all heights up to n = k, and let A ⇒* w have a derivation tree
of height k + 1. Then A ⇒ BC ⇒* uv = w, where B ⇒* u and C ⇒* v and the
derivation trees for both of these derivations have height at most k. Therefore
both u and v have length less than or equal to 2^(k-1) and w has length less than
or equal to 2 · 2^(k-1) = 2^k = 2^((k+1)-1).














Previously we had a pumping theorem for regular languages. We now have 
one for context-free languages. 


Theorem 4.7 (Pumping Lemma) Let L be a context-free language. There
exists an integer M so that any word longer than M in L has the form xuwvy
where uv is not the empty word, the word uwv has length less than or equal
to M, and x u^n w v^n y ∈ L for all n ≥ 0.


Proof Let L − {λ} be a nonempty language generated by the grammar
Γ = (N, Σ, S, P) in Chomsky normal form with p productions. Let M = 2^p.
Assume there is a word z in L with length greater than or equal to M. Then
by the previous lemma, the derivation tree has height greater than p. Therefore
there is a path S → ··· → a in the derivation tree with length greater than p,
where a is a letter of z. Since there are only p productions, some nonterminal
occurs more than once along this path on the left-hand side of a production.
Let C be the first nonterminal to occur the second time. Therefore we have a
derivation


S ⇒* αCβ ⇒* xuCvy ⇒* xuwvy


where α ⇒* x, β ⇒* y, C ⇒* uCv and C ⇒* w. But using these derivations,
we can form the derivation


S ⇒* xuCvy ⇒* xuuCvvy ⇒* x u^n w v^n y for any positive integer n,

while S ⇒* xCy ⇒* xwy gives the case n = 0.


Since the first production in the derivation C ⇒* uCv has the form C → AB,
and there are no empty words, either u or v is not the empty word. Pick a letter
a in uwv; we can work our way back to S using one occurrence of each of the
derivations C ⇒* uCv and C ⇒* w. Hence the length of the path is at most p,
and the length of uwv is less than or equal to M.














We are now able to find a language which is not a context-free language. 


Corollary 4.1 The language L = {a^m b^m a^m : m ≥ 1} is not a context-free
language.


Proof Assume m is large enough so that the length of a^m b^m a^m is larger than
M. Therefore a^m b^m a^m = puqvr where p u^n q v^n r ∈ L for all n ≥ 1. If either u
or v contains both a and b, for example assume u = a^i b^j, then (a^i b^j)^n must be
a substring of p u^n q v^n r, which is clearly impossible. Thus u and v each consist
entirely of as or entirely of bs. They cannot both be strings of as or both be
strings of bs, since the number of occurrences of the common letter would
continue to increase as n increases but the number of occurrences of the
other letter would not increase. Yet for both blocks of as to grow, u must be a
string of as at the beginning of each word a^m b^m a^m and v must be a string of as
at the end of each word a^m b^m a^m, which is a contradiction.
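
The argument of the corollary can be checked by brute force on a small word. The following is only a sketch: the function names `in_language` and `surviving_splits` are illustrative and do not come from the text. It tries every way of writing a^3 b^3 a^3 as puqvr with uv nonempty and confirms that none of them can be pumped inside L. Such a check does not replace the proof, since it examines one word only.

```python
from itertools import combinations_with_replacement


def in_language(word):
    """Membership test for {a^m b^m a^m : m >= 1}."""
    m, remainder = divmod(len(word), 3)
    return m >= 1 and remainder == 0 and word == "a" * m + "b" * m + "a" * m


def surviving_splits(word, exponents=(0, 2, 3)):
    """Return every split word = p u q v r (uv nonempty) for which
    p u^n q v^n r stays in the language for each n in `exponents`."""
    found = []
    cuts = range(len(word) + 1)
    # combinations_with_replacement yields cut points with i <= j <= k <= l
    for i, j, k, l in combinations_with_replacement(cuts, 4):
        p, u, q, v, r = word[:i], word[i:j], word[j:k], word[k:l], word[l:]
        if not u and not v:
            continue
        if all(in_language(p + u * n + q + v * n + r) for n in exponents):
            found.append((p, u, q, v, r))
    return found


print(surviving_splits("aaabbbaaa"))  # prints []
```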














Example 4.19 The language L = {a^i b^j c^i d^j : i, j ≥ 1} is not a context-free
language.


Proof Let M and x u^n w v^n y be the same as in the previous theorem. Therefore
|uwv| ≤ M. Consider a^M b^M c^M d^M ∈ L. Since |uwv| ≤ M, the word uwv must be
contained in a power of one of the letters a, b, c, or d or in the powers of two
adjacent letters. If it is contained in the power of one letter, say a, then the
word xwy obtained by taking n = 0 has fewer as than cs, which is a contradiction.
If it is contained in the powers of two adjacent letters, say a and b, then xwy
has fewer as than cs or fewer bs than ds, which is again a contradiction.





Example 4.20 The language L = {ww : w ∈ {a, b}*} is not context-free. We
shall see in Theorem 4.10 that the intersection of a regular language and a
context-free language is context-free. But L ∩ a*b*a*b* = {a^i b^j a^i b^j} is not
context-free. The argument is the same as in the previous example.


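Example 4.21 The language L = {x : the length of x is a prime} is not
context-free. Since primes are arbitrarily large, some element z of L with prime
length n ≥ M + 2 must have the form z = xuwvy given by the Pumping Lemma. Let
m = |xwy|. Then |uv| = n − m ≥ 1, and m ≥ n − M ≥ 2 because |uv| ≤ |uwv| ≤ M.
Hence |x u^m w v^m y| = m + m(n − m) = m(1 + n − m), a product of two factors
that are both at least 2, which is not a prime.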


We have previously seen that the set of regular languages is closed under 
the operations concatenation, union, Kleene star, intersection and complement. 
We now explore these same operations for the set of context-free languages. 


Theorem 4.8 The set of context-free languages is closed under the operations 
of concatenation, union, and Kleene star. 


Proof Let L₁ and L₂ be generated by grammars Γ₁ = (N₁, Σ₁, S₁, P₁) and
Γ₂ = (N₂, Σ₂, S₂, P₂) respectively. Assume N₁ and N₂ are disjoint. This can
always be accomplished by relabeling the elements in either N₁ or N₂.

The language L₁L₂ can be generated by the grammar Γ = (N, Σ, S, P)
where N = N₁ ∪ N₂ ∪ {S}, Σ = Σ₁ ∪ Σ₂, and P = P₁ ∪ P₂ ∪ {S → S₁S₂}. If
u ∈ L₁ and v ∈ L₂, then S₁ ⇒* u in Γ₁, S₂ ⇒* v in Γ₂, and using leftmost
derivation, we have S ⇒ S₁S₂ ⇒* uS₂ ⇒* uv in Γ.

The language L₁ ∪ L₂ can be generated by the grammar Γ = (N, Σ, S, P)
where N = N₁ ∪ N₂ ∪ {S}, Σ = Σ₁ ∪ Σ₂, and P = P₁ ∪ P₂ ∪ {S → S₁, S → S₂}.
Let w ∈ L₁ ∪ L₂. Therefore w ∈ L₁ or w ∈ L₂. If w ∈ L₁, then S₁ ⇒* w in
Γ₁, and S ⇒ S₁ ⇒* w in Γ. If w ∈ L₂, then S₂ ⇒* w in Γ₂, and S ⇒ S₂ ⇒* w
in Γ.

The language L₁* can be generated by the grammar Γ = (N, Σ, S, P)
where N = N₁ ∪ {S}, Σ = Σ₁ and P = P₁ ∪ {S → S₁S, S → λ}. Let
w₁, w₂, w₃, ..., w_n ∈ L₁. Using productions S → S₁S and S → λ, we can
form the derivation S ⇒* S₁^n = S₁S₁S₁···S₁. Using leftmost derivations
we can derive S ⇒* S₁S₁S₁···S₁ ⇒* w₁S₁S₁···S₁ ⇒* w₁w₂S₁···S₁ ⇒*
w₁w₂···w_n in Γ. Hence L₁* = L, the language generated by Γ.
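
The three constructions in this proof are mechanical enough to write out. The sketch below assumes a grammar is stored as a dictionary from each nonterminal to a list of right-hand sides (tuples of symbols). This representation, and the names `rename`, `concatenation`, `union`, `star` and `words`, are choices made for illustration only. The helper `words` is a crude enumerator used to inspect small examples, not an efficient parser.

```python
def rename(productions, suffix):
    """Append `suffix` to every nonterminal so that two grammars become disjoint."""
    heads = set(productions)
    fix = lambda symbol: symbol + suffix if symbol in heads else symbol
    return {fix(head): [tuple(fix(s) for s in body) for body in bodies]
            for head, bodies in productions.items()}


def concatenation(p1, s1, p2, s2, start="S"):
    return {**p1, **p2, start: [(s1, s2)]}          # adds S -> S1 S2


def union(p1, s1, p2, s2, start="S"):
    return {**p1, **p2, start: [(s1,), (s2,)]}      # adds S -> S1 and S -> S2


def star(p1, s1, start="S"):
    return {**p1, start: [(s1, start), ()]}         # adds S -> S1 S and S -> lambda


def words(productions, start, max_len):
    """All generated words of length <= max_len, found by exhaustive leftmost
    derivation. Sentential forms that are too long are discarded, which is
    adequate only for small examples such as the ones below."""
    seen, result, todo = set(), set(), [(start,)]
    while todo:
        form = todo.pop()
        terminals = sum(1 for s in form if s not in productions)
        if form in seen or terminals > max_len or len(form) > 2 * max_len + 2:
            continue
        seen.add(form)
        index = next((i for i, s in enumerate(form) if s in productions), None)
        if index is None:                            # no nonterminal left
            result.add("".join(form))
            continue
        for body in productions[form[index]]:        # expand leftmost nonterminal
            todo.append(form[:index] + body + form[index + 1:])
    return result


g1 = rename({"S": [("a", "S", "b"), ("a", "b")]}, "1")   # a^n b^n, n >= 1
g2 = rename({"S": [("c", "S"), ("c",)]}, "2")            # c^n, n >= 1

print(sorted(words(concatenation(g1, "S1", g2, "S2"), "S", 4)))  # ['abc', 'abcc']
print(sorted(words(union(g1, "S1", g2, "S2"), "S", 3)))          # ['ab', 'c', 'cc', 'ccc']
print(sorted(words(star(g1, "S1"), "S", 4)))                     # ['', 'aabb', 'ab', 'abab']
```

The call to `rename` plays the role of the relabeling step that makes N₁ and N₂ disjoint, and the new start symbol S is assumed not to occur in either grammar.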














Theorem 4.9 The set of context-free languages is not closed under the
operations of intersection and complement.


Proof The sets {a^m b^m c^n : m, n ≥ 0} and {a^m b^n c^n : m, n ≥ 0} are
context-free. The first is generated by the grammar with productions


P = {S → BC, B → aBb, B → λ, C → cC, C → λ}.
The second is generated by the grammar with productions
P = {S → AB, A → aA, A → λ, B → bBc, B → λ}.


However, the intersection is the language L = {a^m b^m c^m : m ≥ 0}, which is
not context-free by the same argument as the one used for Corollary 4.1.

If the set of context-free languages were closed under complement then, since
L₁ ∩ L₂ = (L₁^c ∪ L₂^c)^c, where ^c denotes the complement, and the set is closed
under union, the set of context-free languages would be closed under
intersection, which we have already shown is not true.














Although the intersection of context-free languages is not necessarily a 
context-free language, the intersection of a context-free language and a reg- 
ular language is a context-free language. The proof is somewhat similar to the 
one showing that the union of languages accepted by an automaton is accepted 
by an automaton. 


Theorem 4.10 The intersection of a regular language and a context-free
language is context-free.


Proof Let the pushdown automaton M = (Σ, Q, s, I, Υ, F) where Σ is the
alphabet, Q is the set of states, s is the initial or starting state, I is the set of stack
symbols, F is the set of acceptance states, and Υ is the transition relation where
the relation Υ is a finite subset of ((Q × Σ* × I*) × (Q × I*)). Let the
deterministic finite automaton M₁ = (Σ₁, Q₁, q₀, Υ₁, F₁) where Σ₁ is the alphabet,
Q₁ is the set of states, q₀ is the initial or starting state, F₁ is the set of acceptance
states, and Υ₁ is the transition function. We now define the pushdown automaton
M₂ = (Σ₂, Q₂, s₂, I₂, Υ₂, F₂) where s₂ = (s, q₀), Σ₂ = Σ₁ ∪ Σ, I₂ = I,
Q₂ = Q × Q₁, and F₂ = F × F₁. Define Υ₂ by (((s_i, q_j), u, X), ((s_m, q_n), b)) ∈ Υ₂
if and only if ((s_i, u, X), (s_m, b)) ∈ Υ in M and Υ₁(q_j, u) = q_n in M₁, where Υ₁
is extended to words in the usual way. A word w is accepted in M₂ if and only if
((s, q₀), w, λ) ⊢* ((s_a, q_b), λ, λ) in M₂, where s_a and q_b are acceptance states
in M and M₁ respectively. Thus w is accepted by the pushdown automaton M and
also accepted by M₁.

To show M₂(L) = M(L) ∩ M₁(L), the reader is asked to first show that
((s, q₀), w, λ) ⊢* ((s_m, q_n), λ, α) if and only if (s, w, λ) ⊢* (s_m, λ, α) and
(q₀, w) ⊢* (q_n, λ), using induction on the number of operations in ⊢*. The
theorem immediately follows.














Definition 4.10 A nonterminal in a context-free grammar is useless if it does
not occur in any derivation S ⇒* w, for w ∈ Σ*. If a nonterminal is not useless,
then it is useful.


Theorem 4.11 Given a context-free grammar, it is possible to find and remove 
all productions with useless nonterminals. 


Proof We first remove any nonterminal U so that there is no derivation U
⇒* w, for w ∈ Σ*. To find such nonterminals, let X be defined as follows:
(1) For each nonterminal V such that V → w is a production for w ∈ Σ*, let
V ∈ X. (2) If V → V_1 V_2 ··· V_n where V_i ∈ X or V_i ∈ Σ* for 1 ≤ i ≤ n, let V ∈ X.
Continue step (2) until no new nonterminals are added to X. Any nonterminal
U not in X has no derivation U ⇒* w. If S is not in X, then the language
generated by the context-free grammar is empty and we are done. There are no
useful nonterminals. Assume S is in X. All productions containing nonterminals
not in X are removed from the set of productions P.

Assume such productions have been removed. We now have to remove any
nonterminal U which is not reachable by S, i.e. there is no derivation S ⇒* W
where U is in the string W. To test each nonterminal U we form a set Y_U as
follows: (1) If V → W and U is in the string W, then V ∈ Y_U. (2) If R → T,
where an element of Y_U is in the string T, then R ∈ Y_U. Continue step (2)
until no new nonterminals are added to Y_U. If S ∈ Y_U, then U is reachable
by S. If not, U is not reachable by S. Remove all productions which contain
nonterminals not reachable by S. The context-free grammar created contains
no useless nonterminals.
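
The two passes of this proof can be sketched in a few lines. The representation below (a dictionary from nonterminals to lists of right-hand sides) and the function names are assumptions made for illustration. One deliberate variation: instead of building a set Y_U for each nonterminal U, the second pass searches forward from S, which finds the same reachable nonterminals in a single sweep.

```python
def generating(productions, terminals):
    """The set X of the proof: nonterminals that derive some terminal word."""
    x, changed = set(), True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in x and any(
                all(s in terminals or s in x for s in body) for body in bodies
            ):
                x.add(head)
                changed = True
    return x


def reachable(productions, start):
    """Nonterminals that occur in some string derivable from `start`."""
    seen, todo = {start}, [start]
    while todo:
        for body in productions.get(todo.pop(), []):
            for s in body:
                if s in productions and s not in seen:
                    seen.add(s)
                    todo.append(s)
    return seen


def remove_useless(productions, terminals, start):
    x = generating(productions, terminals)
    if start not in x:
        return {}                      # the language is empty
    pruned = {
        head: [b for b in bodies if all(s in terminals or s in x for s in b)]
        for head, bodies in productions.items() if head in x
    }
    keep = reachable(pruned, start)
    return {head: bodies for head, bodies in pruned.items() if head in keep}


def is_empty(productions, terminals, start):
    return start not in generating(productions, terminals)


grammar = {
    "S": [("A", "B"), ("a",)],
    "A": [("a", "A")],     # A never produces a terminal word
    "B": [("b",)],         # B is only reachable through S -> AB
    "C": [("c",)],         # C is not reachable from S
}
print(remove_useless(grammar, {"a", "b", "c"}, "S"))  # {'S': [('a',)]}
```

In the example B is removed even though it derives b, because its only occurrence is in S → AB, which disappears in the first pass. This is why the passes are applied in the order given in the proof.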














Theorem 4.12 It is possible to determine whether a context-free language L 
is empty. 


Proof The language L generated by a context-free grammar is empty if and only
if the grammar contains no useful nonterminals.

Theorem 4.13 Given w ∈ Σ*, and a context-free grammar G, it is possible
to determine whether w is in the language generated by G.


Proof Assume that G is in Chomsky normal form. Let w be a word of length
n ≥ 1. Each step of the derivation makes the derived string longer except when a
nonterminal is replaced by a terminal. Therefore a derivation S ⇒* w uses n − 1
productions of the form A → BC and n productions of the form A → a, and so has
length k = 2n − 1. There are only finitely many derivations of this length, so
check all derivations of length 2n − 1.














The above proof shows that it is possible to determine whether a word is in
the language generated by a grammar, but it is not a practical method.
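
A practical alternative, which is not the method of the proof above, is the well-known CYK (Cocke-Younger-Kasami) dynamic-programming algorithm for grammars in Chomsky normal form. The sketch below assumes the grammar is supplied as two dictionaries, `unit` for productions A → a and `binary` for productions A → BC. These names and this encoding are illustrative choices, and the empty word is simply rejected.

```python
def cyk(word, unit, binary, start="S"):
    """Decide whether `word` is generated by a grammar in Chomsky normal form.

    unit:   maps a terminal a to the set of nonterminals A with A -> a
    binary: maps a pair (B, C) to the set of nonterminals A with A -> BC
    """
    n = len(word)
    if n == 0:
        return False
    # table[i][length] holds the nonterminals deriving word[i : i + length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, letter in enumerate(word):
        table[i][1] = set(unit.get(letter, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for b in table[i][split]:
                    for c in table[i + split][length - split]:
                        table[i][length] |= binary.get((b, c), set())
    return start in table[0][n]


# Chomsky normal form grammar for {a^n b^n : n >= 1}:
#   S -> AB | AT,  T -> SB,  A -> a,  B -> b
unit = {"a": {"A"}, "b": {"B"}}
binary = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

print(cyk("aabb", unit, binary))  # True
print(cyk("abab", unit, binary))  # False
```

The table has on the order of n² entries and each is filled by trying at most n split points, so the running time is polynomial in the length of the word, in contrast to the exhaustive search over derivations.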


Theorem 4.14 Let G be a context-free grammar in Chomsky normal form
with exactly p productions. The language L(G) is infinite if and only if there
exists a word ω in L(G) such that 2^p < |ω| ≤ 2^(p+1).


Proof If there is a word with length greater than 2^p then by the proof of the
Pumping Lemma, L(G) is infinite. Conversely, suppose L(G) is infinite, so that it
contains words of length greater than 2^(p+1), and let ω be the shortest word with
length greater than 2^(p+1). By the Pumping Lemma, ω = xuwvy, where the
length of uwv is at most 2^p and μ = xwy is in L(G). But |μ| = |ω| − |uv| > 2^p.
Also |μ| < |ω| and ω is the shortest word with length greater than
2^(p+1). Therefore |μ| ≤ 2^(p+1).














Theorem 4.15 It is possible to determine whether a language generated by
a context-free grammar is finite or infinite.


Proof Put the grammar in Chomsky normal form and let p be its number of
productions. Since it is possible to determine whether a word is in the language
of a context-free grammar, simply try all words with length between 2^p and
2^(p+1) to see if one of them is in the language. If one is, the language is
infinite. If not, the language is finite.
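
As a rough sketch of this decision procedure, the code below tests every word whose length n satisfies 2^p < n ≤ 2^(p+1), using a CYK-style membership test as the "determine whether a word is in the language" step. That choice is an assumption of this sketch, as are the names `derives` and `is_infinite` and the dictionary encoding of the grammar. The number of candidate words grows exponentially, so this is practical only for very small grammars and alphabets.

```python
from itertools import product


def derives(word, unit, binary, start="S"):
    """CYK membership test for a Chomsky normal form grammar (nonempty words)."""
    n = len(word)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, letter in enumerate(word):
        table[i][1] = set(unit.get(letter, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for b in table[i][split]:
                    for c in table[i + split][length - split]:
                        table[i][length] |= binary.get((b, c), set())
    return start in table[0][n]


def is_infinite(alphabet, unit, binary, start="S"):
    # p counts the productions A -> a and A -> BC of the grammar
    p = sum(len(v) for v in unit.values()) + sum(len(v) for v in binary.values())
    for length in range(2 ** p + 1, 2 ** (p + 1) + 1):
        for letters in product(alphabet, repeat=length):
            if derives(letters, unit, binary, start):
                return True
    return False


# S -> SS | a generates a, aa, aaa, ... (infinite)
print(is_infinite("a", {"a": {"S"}}, {("S", "S"): {"S"}}))  # True
# S -> AA, A -> a generates only aa (finite)
print(is_infinite("a", {"a": {"A"}}, {("A", "A"): {"S"}}))  # False
```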














Exercises 


(1) Let grammar Γ = (N, Σ, S, P) be defined by N = {S, A, B}, Σ =
{a, b, c}, and the set of productions P given by


S → AB     A → acA    B → bcB    B → bB
A → aBa    B → λ      A → a.


Let L be the language generated by Γ. Find the grammar that generates L*.
(2) Let L₁ be the language generated by the grammar Γ₁ = (N, Σ, S, P)
defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P
given by
S → aA    A → aAB    A → a    B → b    B → λ,


and L₂ be the language generated by the grammar Γ₂ = (N, Σ, S, P)
defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P
given by


S → AB    A → abaA    A → λ    B → Bcacc    B → λ.


Find the grammar that generates L₁L₂.

(3) Let L₁ be the language generated by the grammar Γ₁ = (N, Σ, S, P)
defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P
given by


S → AdB    A → abaA    A → λ    B → Bcacb    B → λ,


and L₂ be the language generated by the grammar Γ₂ = (N, Σ, S, P)
defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P
given by


S → AB     A → acA    B → bcB    B → bB
A → aAa    B → λ      A → d.


Find the grammar that generates L₁ ∪ L₂.

Determine whether the following languages are context-free. If the lan- 
guage is context-free, construct a grammar that generates it. If it is not 
context-free, prove that it is not. 

(4) L={a™b"c* : m,n =1,2,...}. 

(5) L={ww*w : w e€ {a, b}*}. 

(6) L = {w ∈ {a, b, c}* : w has an equal number of as and bs}.
(7) L={a"b**c" : n= 1,2,...}. 

(8) Prove the induction step in Theorem 4.10. 


5 


Turing machines 


5.1 Deterministic Turing machines 


The Turing machine is certainly the most powerful of the machines that we 
have considered and, in a sense, is the most powerful machine that we can 
consider. It is believed that every well-defined algorithm that people can be 
taught to perform or that can be performed by any computer can be performed 
on a Turing machine. This is essentially the statement made by Alonzo Church 
in 1936 and is known as Church’s Thesis. This is not a theorem. It has not been 
mathematically proven. However, no one has found any reason for doubting it. 

It is interesting that although the computer, as we know it, had not yet been 
invented when the Turing machine was created, the Turing machine contains 
the theory on which computers are based. Many students have been amazed to 
find that, using a Turing machine, they are actually writing computer programs. 
Thus computer programs preceded the computer. 

We warn the reader in advance that if they look at different books on Turing 
machines, they will find the descriptions to be quite different. One author will 
state a certain property to be required of their machine. Another author will 
strictly prohibit the same property on their machine. Nevertheless, the machines, 
although different, have the same capabilities. 

The Turing machine has an input alphabet Σ, a set of tape symbols Γ
containing Σ, and a set of states Q, similar to the automaton. The Turing
machine has two special states, the start state s0 and the halt state h. When the
machine reaches the halt state it shuts down. It also has a tape which is infinitely
long on the right. If made of paper it can wipe out a forest.

The tape contains squares on which letters of the alphabet and other symbols
can be written or erased. Only a finite number of the squares may contain tape
symbols. All of the squares to the right of the last square containing a tape
symbol are considered to be blank. Some of the squares between or in front of letters
may also be blank. In addition to the tape there is a head which can read any tape
symbol which is on the square of the tape at which the head is pointing. It also
can be in different states just like an automaton and a pushdown automaton. Also
like the automaton, the input can be a letter of the alphabet which can be read
from the tape together with the current state of the machine. Depending on the
input and its current state, the machine, in addition to changing states, can print
a different symbol on the square of the tape in front of it or erase the letter in the
square. In addition or instead, the head can move left or right from the square it
has just read to the next square. The blank can be both read and printed by the
Turing machine, but is not considered an element of the tape symbols. As input,
reading a blank is simply reading the absence of any of the tape symbols. Printing
a blank is considered to be erasing the symbol currently in that square. We use #
for blank. The Turing machine shown below is in state s1 and is reading letter a.





[Figure: a Turing machine tape containing b a a b followed by blanks, with the head in state s1 over a square containing a]


More formally we have the following definition.
Definition 5.1 A deterministic Turing machine is a sextuple

(Q, Σ, Γ, δ, s0, h)
where Q is the set of states, Σ is the input alphabet, Γ is a finite set of tape
symbols, which includes the alphabet and #, s0 is the starting state, h is the halt
state, and δ is a function from Q × Γ to Q × Γ × N where N consists of L which
indicates a movement on the tape one position to the left, R which indicates a
movement on the tape one position to the right, and # which indicates that no
movement takes place.


Just like any computer, a Turing machine has a program or set of rules which 
tell the machine what to do. An example of a rule is


δ(s1, a) = (s2, b, L)
which we shall denote as
(s1, a, s2, b, L).


This rule says that if the machine is in state s1 and reads the letter a, it is to
change to state s2, print the letter b in place of the letter a and move one square
to the left. The rule


(s1, a, s2, #, R)
says that if the machine is in state s1 and reads the letter a, it changes to state
s2, erases the a and moves one square to the right. The rule


(s1, #, h, #, #)


says that if the machine is in state s1 and reads a blank then it halts, and thus
does not print anything or move the position on the tape. For consistency we
shall always require that the machine begins in the leftmost square.

It may appear that since δ is a function, the deterministic Turing machine will
either continue forever or reach the halt state. However, if the Turing machine 
is reading the leftmost square on the tape and gets the command to move left, it 
obviously cannot do so. In such a case we say that the system crashes. Often a 
special symbol is placed in the first box to warn the machine that it is reaching 
the end of the tape. 

Obviously there is a difference between the machine stopping because it 
crashes and stopping when it reaches the halt state. In the second case the 
machine has completed its program. 

It is obvious that our rules allow us both to print a letter and move the position 
on the tape to the left or to the right. Some definitions allow a machine either 
to print a letter or to move the head, but not both. Thus it requires two separate 
rules to print a letter and move the position on the tape. 

We shall begin with a program that simply moves the position of the machine
on the tape from the beginning to the end of a string. The alphabet is Σ = {a, b}
and the set of tape symbols is Γ = {a, b, #}. We shall have the set of states
Q = {s0, s1, h} and the set of rules


(s0, a, s1, a, R) (s0, b, s1, b, R) (s1, a, s1, a, R)
(s1, b, s1, b, R) (s1, #, h, #, #).


This program leaves everything alone. It simply reads each letter and then moves 
right to the next square. When it reaches a blank, it shuts down. However, this 
program does do something which we shall later need. It moves the position on 
the tape from the beginning of the word to the end of the word. Instead of having 
it reach a blank and shut down, we will put it at the beginning of another program 
where we want the position of the machine to be at the end of the word. Hence 
we shall call this program go-end. As we demonstrate this program, it would 
be rather tiresome to continually draw the Turing machine so rather than draw

[Figure: the tape a b a b b # ... with the head over the second square and the machine in state s1]

which shows the position of the machine at the second square of the tape and
in state s1, while the first and third squares of the tape contain an a, the second,
fourth and fifth squares contain a b and the other squares are blank; we replace
this with


      1
    a b a b b
      -


where the line below the b denotes the location of the head, and the 1 above the
b denotes the current state of the machine. We shall call this the configuration
of the Turing machine.

As we begin our program the machine has configuration


    0
    a b a b b
    -


We then apply rule
(s0, a, s1, a, R)


moving the head to the right and changing from state s0 to state s1 and our
machine then has configuration


      1
    a b a b b
      -


We then apply rule
(s1, b, s1, b, R)
moving the head to the right again and our machine then has configuration


        1
    a b a b b
        -


We then apply rule
(s1, a, s1, a, R)


moving the head to the right again and our machine then has configuration


          1
    a b a b b
          -


We then apply rule
(s1, b, s1, b, R)
moving the head to the right again and our machine then has configuration


            1
    a b a b b
            -


We apply the same rule again and have


              1
    a b a b b #
              -


We then use rule
(s1, #, h, #, #)


and the machine shuts down.
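
The walk-through above can be reproduced with a few lines of code. The following is a minimal sketch of a simulator, assuming rules are stored in a dictionary keyed by (state, symbol read). The function name `run`, the step limit and the returned tuple are choices made here for illustration, not notation from the text.

```python
BLANK = "#"

# The five rules of go-end, keyed by (state, symbol read).
GO_END = {
    ("s0", "a"): ("s1", "a", "R"),
    ("s0", "b"): ("s1", "b", "R"),
    ("s1", "a"): ("s1", "a", "R"),
    ("s1", "b"): ("s1", "b", "R"),
    ("s1", "#"): ("h", "#", "#"),
}


def run(rules, tape, state="s0", halt="h", max_steps=10_000):
    """Run a deterministic Turing machine starting on the leftmost square.

    Returns (outcome, state, tape, head), where outcome is 'halt', 'hang'
    (no rule applies) or 'crash' (a move left from the leftmost square).
    """
    cells, head = list(tape) or [BLANK], 0
    for _ in range(max_steps):
        if state == halt:
            return "halt", state, "".join(cells).rstrip(BLANK), head
        key = (state, cells[head])
        if key not in rules:
            return "hang", state, "".join(cells).rstrip(BLANK), head
        state, cells[head], move = rules[key]
        if move == "R":
            head += 1
            if head == len(cells):      # the tape is blank to the right
                cells.append(BLANK)
        elif move == "L":
            if head == 0:
                return "crash", state, "".join(cells).rstrip(BLANK), head
            head -= 1
    raise RuntimeError("step limit reached")


print(run(GO_END, "ababb"))  # ('halt', 'h', 'ababb', 5)
```

The head position 5 is the sixth square, the first blank after the word, which matches the final configuration of the walk-through.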

We mentioned previously that if the position on the tape is on the leftmost 
position on the tape and gets an instruction to move left, we say that the machine 
crashes, and the machine ceases functioning. 

We shall now construct a rather unusual program. This program causes the
machine to crash. We shall again let the set of tape symbols Γ be the set
{a, b, #}. We shall also assume we have states Q = {s0, s1, ..., s_j, ...}. It shall
have the rules


(s_j, a, s_j, a, L) (s_j, b, s_j, b, L) (s_j, #, s_j, #, L).


If we have a larger alphabet, we simply add more rules, so that regardless of
what the machine reads when it is in state s_j, it continues to go left until it
crashes. This program does not begin at s0 because we want to include it in
other programs when we want to crash the system. We shall call this program
go-crash. State s_j is the “suicide” state. When we want to crash the system
we simply instruct it to go to state s_j.

It seems pretty silly to think of either go-end or go-crash as complete pro- 
grams. We really want to use them inside other programs. We shall refer to 
these types of program as subroutines. 

The reason for the go-crash program is really theoretical. If δ is a partial
function instead of a function then the Turing machine is still deterministic in the
sense that for every input for which there is a rule, there is a unique output. If for
every input, there is a unique output, then the set of rules would define a function.
If the rules do not define a function then there is a state s and an input letter a
for which there is no rule. When this happens, we say that the system hangs,
since it cannot go on. We shall again meet this problem with nondeterministic
automata. Suppose we would like the set of rules to define a function, but we
still want the program to stop when it is in state s and reads a. The system cannot
hang since the function is defined for every input. We can however add a rule


(s, a, s_j, a, L)


which puts the system into the suicide state and causes it to crash using
go-crash. Thus the system crashes instead of hanging and we have expanded
our rules so that we have a function. In this discussion, we will state only
relevant rules with the understanding that we could produce a function using
go-crash if we really wanted to do so.
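
The difference between hanging and crashing can be seen in a small experiment. The sketch below is illustrative only: the state names `s0` and `sj`, the sample rules, and the function `outcome` are assumptions made here. It runs a partial set of rules, which hangs on the letter a, and then the same rules completed with a transfer to the suicide state and the go-crash rules.

```python
def outcome(rules, tape, state="s0", halt="h", blank="#"):
    """Return 'halt', 'hang' or 'crash' for a machine started on the leftmost square."""
    cells, head = list(tape) or [blank], 0
    while state != halt:
        if (state, cells[head]) not in rules:
            return "hang"                   # no rule for this state and symbol
        state, cells[head], move = rules[(state, cells[head])]
        if move == "L":
            if head == 0:
                return "crash"              # told to move left of the first square
            head -= 1
        elif move == "R":
            head += 1
            if head == len(cells):
                cells.append(blank)
    return "halt"


# A partial rule set: it skips over bs and halts at the first blank,
# but has no rule for reading an a in state s0.
partial = {
    ("s0", "b"): ("s0", "b", "R"),
    ("s0", "#"): ("h", "#", "#"),
}

# Completing it: send the missing case to the suicide state sj,
# and add the go-crash rules for sj.
total = dict(partial)
total[("s0", "a")] = ("sj", "a", "L")
for symbol in "ab#":
    total[("sj", symbol)] = ("sj", symbol, "L")

print(outcome(partial, "bb"))   # halt
print(outcome(partial, "bab"))  # hang
print(outcome(total, "bab"))    # crash
```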

It is perhaps time we considered something a bit more practical for the Turing
machine. We begin by showing some of its properties as a text editor. Our first
step is not exactly a giant one. We show how to move the position on the tape
to the right n steps. Again we assume both the input and output alphabet are
the set {a, b}. If we have a larger alphabet, we simply add appropriate rules for
each new letter. The set of states Q = {s1, ..., s_j, ..., s_n, s_(n+1)}. We shall call
this new subroutine go-right(n). It has the following rules:


(s1, a, s2, a, R) (s2, a, s3, a, R) (s3, a, s4, a, R) ··· (s_n, a, s_(n+1), a, R)
(s1, b, s2, b, R) (s2, b, s3, b, R) (s3, b, s4, b, R) ··· (s_n, b, s_(n+1), b, R).


It is easily seen that if we begin in state s1, each application of a rule, regardless
of the letter read, moves the position on the tape one step to the right and
increases the state. After n steps the head has been moved to the right by n
squares and we are in state s_(n+1). It is hoped that, with little effort, the reader
can create a subroutine for moving to the left by n squares.

Suppose that after moving left or right by n squares, or without moving at
all, we want to change the letter in the current square occupied from a to b.
Assuming that we are in state s_i at the time then we simply use the rule


(s_i, a, s_i, b, #).
Moving along, suppose that
Σ = {a_1, a_2, a_3, ..., a_n, b_1, b_2, b_3, ..., b_n},
Γ = Σ ∪ {#} and we want to replace
a_1 a_2 ··· a_i a_(i+1) ··· a_j ··· a_n
with
a_1 a_2 ··· a_i b_(i+1) ··· b_j a_(j+1) ··· a_n.


We first use go-right(i) to move to the proper position so the head is on a_(i+1).
Assume we are in state s'. We then use the rules

(s', a_(i+1), s'_1, b_(i+1), R)

(s'_1, a_(i+2), s'_2, b_(i+2), R)

(s'_2, a_(i+3), s'_3, b_(i+3), R)

(s'_3, a_(i+4), s'_4, b_(i+4), R)

···

(s'_(j-i-1), a_j, s'_(j-i), b_j, R)


to replace the letters and use go-left(j) to return to the original spot.

The next text edit feature which we shall illustrate is to insert a letter in a
string. We shall find this feature very handy in the near future. We shall call this
subroutine insert(c). Say that we have a string


a_1 a_2 ··· a_i a_(i+1) ··· a_(n-1) a_n
and we want to replace it with
a_1 a_2 ··· a_i c a_(i+1) ··· a_(n-1) a_n


so that the string a_(i+1) ··· a_(n-1) a_n must be moved one square to the right and
c placed in the square formerly occupied by a_(i+1). We shall assume that the
string contains no blanks. If it does, a special symbol will have to be used to
denote the end of the string. Actually this is not quite the order in which we
shall proceed. First, for simplicity, assume that the input and output alphabets
are the same and that Σ = {a, b, c} and Γ = {a, b, c, #}. We shall assume that we
know the position on the tape in which the c is to be placed (i.e. we know i) and
that we know the length of the string (i.e. we know n). First we use go-right(i)
to place the head where the letter c is to be placed. Assume that we are in state
s_x when we reach this square. We are going to need a state for each letter in the
alphabet. Thus we shall need s_a, s_b, and s_c. The process is really rather simple.
When we print c, we need to remember a_(i+1) so that we can print it in the next
square. We do this by entering the state for the letter a_(i+1) after we have
printed c and then moving right. In that state, we print a_(i+1) in the square
occupied by a_(i+2) and then enter the state for the letter a_(i+2) and again move
right. Each time we print a letter, we enter the state
corresponding to the letter destroyed and in this way “remember” this letter
so it can be printed in the next square. Remember that in the state for a_(i+1) we
print a_(i+1) regardless of the letter read. Finally, when we reach a blank square,
we print a_n and then use go-left(n) to return to the beginning of the string. Also
it is possible that c occurs elsewhere in the string; however, we shall assume that
a_(i+1) is not already c. Thus our rules for actually printing c and moving over the
other letters are


(s_x, a, s_a, c, R) (s_b, c, s_c, b, R) (s_x, b, s_b, c, R) (s_c, a, s_a, c, R)
(s_c, b, s_b, c, R) (s_a, b, s_b, a, R) (s_c, c, s_c, c, R) (s_a, c, s_c, a, R)
(s_b, a, s_a, b, R) (s_b, #, s_y, b, #) (s_b, b, s_b, b, R) (s_c, #, s_y, c, #)
(s_a, a, s_a, a, R) (s_a, #, s_y, a, #)


and we end up in state s_y.
For example assume we have the word abbac and want to insert c so that
we have abcbac. Using go-right(2), we have configuration


        x
    a b b a c
        -


Applying rule
(s_x, b, s_b, c, R)


we have configuration


          b
    a b c a c
          -


In the future we will condense this statement to


(s_x, b, s_b, c, R) ⊢
          b
    a b c a c
          -


We then have the following rules and configurations


(s_b, a, s_a, b, R) ⊢
            a
    a b c b c
            -

(s_a, c, s_c, a, R) ⊢
              c
    a b c b a #
              -

(s_c, #, s_y, c, #) ⊢
              y
    a b c b a c
              -


and we now use go-left(5) to return to our original position.
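
The fourteen rules of insert(c) follow a simple pattern, so they can be generated rather than listed. The sketch below assumes rules are stored in a dictionary keyed by (state, symbol), with the states s_x, s_y, s_a, s_b, s_c written simply as "x", "y", "a", "b", "c". This encoding and the helper `run` are illustrative choices, not notation from the text.

```python
def run(rules, tape, state, head, final, blank="#"):
    """Apply rules from the given state and head position until `final` is reached."""
    cells = list(tape)
    while state != final:
        while head >= len(cells):
            cells.append(blank)          # squares to the right are blank
        state, cells[head], move = rules[(state, cells[head])]
        head += {"R": 1, "L": -1, "#": 0}[move]
        if head < 0:
            raise RuntimeError("crash: moved left of the first square")
    return "".join(cells).rstrip(blank), head


rules = {}
for read in "ab":                         # a_(i+1) is assumed not to be c
    rules[("x", read)] = (read, "c", "R")          # print c, remember the letter read
for held in "abc":
    for read in "abc":
        rules[(held, read)] = (read, held, "R")    # print held letter, remember the new one
    rules[(held, "#")] = ("y", held, "#")          # print the last letter and stop

print(len(rules))                                  # 14
print(run(rules, "abbac", "x", 2, "y"))            # ('abcbac', 5)
```

The run starts with the head on the third square (index 2), which is where go-right(2) leaves it in the example, and it ends on the sixth square in state s_y.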

Suppose we began at the square of the letter we were replacing and wanted 
to return to that square. Instead of placing the c in the square, we would place 
a marker, and then when we had finished moving the letters we would return to 
the marker and replace it with a c. Details are left to the reader. 

The next text edit feature which we shall illustrate is to delete a letter in a
string and close up the empty square. We shall call this subroutine delete(c).
Say that we have a string a_1 a_2 ··· a_i c a_(i+1) ··· a_(n-1) a_n which contains no blanks
and we want to replace it with a_1 a_2 ··· a_i a_(i+1) ··· a_(n-1) a_n. If there is a blank, we
would have to have a special marker to denote the end of the string. There are at
least two ways of doing this. One way is to move over to the square containing
c and replace it with a marker which is not part of the regular alphabet. Then
go to the end of the string and move each letter to the left in a similar manner
to the one we used to move letters to the right in insert(c), replacing the marker
with the letter to its right and then changing states to return to the front of the
word or wherever desired. If the string contains no blanks then a blank can be
used to denote the end of the string. Otherwise a special symbol will need to
be used. The details of this subroutine are left to the reader.

An alternative form is to move to the letter to be deleted, replace it with a
marker, move to the right to find the next letter and replace the marker with that
letter. Then go right again to the letter which has been duplicated and replace it
with a marker. Continue this process until reaching the end of the string. Let Δ
be the special marker. Assume that we have used go-right(i) to reach the letter
c to be deleted. Again assume that we begin in state s_x and want to end in state
s_y. We shall also let Σ = {a, b, c} and Γ = {a, b, c} ∪ {#, Δ}. We then have
the following set of rules:


(s_x, c, s_x', Δ, R) (s_x, b, s_x', Δ, R) (s_x, a, s_x', Δ, R) (s_x', a, s_a, a, L)
(s_x', b, s_b, b, L) (s_x', c, s_c, c, L) (s_a, Δ, s_x, a, R) (s_b, Δ, s_x, b, R)
(s_c, Δ, s_x, c, R) (s_x', #, s_x'', #, L) (s_x'', Δ, s_y, #, #).


Note that the marker is not actually needed. It is used to make the rules easier
to read.

For example, suppose we have the string abcbac and wish to remove the c 
in the third space. We use go-right(2) to get to the desired space and have the 
configuration 


s_x: a b [c] b a c 


where the state is written first and the scanned square is enclosed in brackets. 
We then have the following rules and configurations: 


(s_x, c, s_x', Δ, R) ⊢ s_x': a b Δ [b] a c 
⇒ (s_x', b, s_b, b, L) ⊢ s_b: a b [Δ] b a c 
⇒ (s_b, Δ, s_x, b, R) ⊢ s_x: a b b [b] a c 
⇒ (s_x, b, s_x', Δ, R) ⊢ s_x': a b b Δ [a] c 
⇒ (s_x', a, s_a, a, L) ⊢ s_a: a b b [Δ] a c 
⇒ (s_a, Δ, s_x, a, R) ⊢ s_x: a b b a [a] c 
⇒ (s_x, a, s_x', Δ, R) ⊢ s_x': a b b a Δ [c] 
⇒ (s_x', c, s_c, c, L) ⊢ s_c: a b b a [Δ] c 
⇒ (s_c, Δ, s_x, c, R) ⊢ s_x: a b b a c [c] 
⇒ (s_x, c, s_x', Δ, R) ⊢ s_x': a b b a c Δ [#] 
⇒ (s_x', #, s_x', #, L) ⊢ s_x': a b b a c [Δ] # 


Finally applying rule 

(s_x', Δ, s_y, #, #) 

we have configuration 


s_y: a b b a c [#] # 


and using go-left(5), we return to the beginning of the string. 
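
The deletion subroutine can be checked mechanically. The following is a minimal sketch in Python, not part of the original text: the helper names `run` and `delete_rules` are hypothetical, the state names are written as plain strings, and the marker is assumed to be the symbol Δ. It applies the rules listed above to the tape abcbac with the head on the third square.

```python
BLANK, MARK = "#", "Δ"  # blank square and the special marker (assumed symbols)

def run(rules, tape, state, head, halt_state, max_steps=1000):
    """Apply rules (state, read, new_state, write, move) until halt_state is reached."""
    delta = {(q, r): (p, w, m) for q, r, p, w, m in rules}
    cells = dict(enumerate(tape))
    for _ in range(max_steps):
        if state == halt_state:
            break
        state, cells[head], move = delta[(state, cells.get(head, BLANK))]
        head += {"L": -1, "R": 1, "#": 0}[move]
    text = "".join(cells.get(i, BLANK) for i in range(max(cells) + 1))
    return text, head

def delete_rules(alphabet="abc"):
    """Build the rule set of the delete subroutine for the given alphabet."""
    rules = [("x'", BLANK, "x'", BLANK, "L"),   # passed the end: step back
             ("x'", MARK, "y", BLANK, "#")]     # erase the last marker and stop
    for ch in alphabet:
        rules += [("x", ch, "x'", MARK, "R"),        # replace the letter by the marker
                  ("x'", ch, "s_" + ch, ch, "L"),    # remember the letter to the right
                  ("s_" + ch, MARK, "x", ch, "R")]   # copy it over the marker
    return rules

tape, head = run(delete_rules(), "abcbac", "x", 2, "y")
print(tape.rstrip(BLANK), head)  # abbac 5
```

The head ends on the sixth square (index 5 when counting from 0), which is why go-left(5) returns it to the beginning of the string.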

Finally we show how to use the Turing machine to duplicate a string. For 
simplicity we shall limit the letters in the string to the set {a, b}. If the alphabet 
is increased, similar rules to those given will be added for each letter included. 
We shall need additional symbols λ_a, λ_b, σ_a, and σ_b. Briefly, the first letter of 
the string is replaced by λ_a if the letter is a and by λ_b if the letter is b. We 
then go to the end of the string and place a corresponding σ_a or σ_b. We then 
return to the first symbol and replace it with the original letter, go to the second 
letter and repeat the process. We continue until we have a string followed by 
corresponding σ_a s and σ_b s. We then replace each σ_a with an a and each σ_b with a b. 

Assume that we start in state s_x and end in state s_y. We then have the following 
set of rules: 


(s_x, a, s_a, λ_a, R) (s_b, #, s_x', σ_b, L) (s_a, a, s_a, a, R) (s_x', a, s_x', a, L) 
(s_x', b, s_x', b, L) (s_a, σ_a, s_a, σ_a, R) (s_x', σ_a, s_x', σ_a, L) (s_a, σ_b, s_a, σ_b, R) 
(s_x, b, s_b, λ_b, R) (s_x', λ_a, s_x, a, R) (s_b, a, s_b, a, R) (s_x', λ_b, s_x, b, R) 
(s_x, σ_a, s_x, a, R) (s_b, σ_a, s_b, σ_a, R) (s_x, σ_b, s_x, b, R) (s_b, σ_b, s_b, σ_b, R) 
(s_a, #, s_x', σ_a, L) (s_a, b, s_a, b, R) (s_x', σ_b, s_x', σ_b, L) (s_b, b, s_b, b, R) 
(s_x, #, s_y, #, #). 


For example, we shall duplicate the word bab. The initial configuration is 


s_x: [b] a b 


where the state is written first and the scanned square is enclosed in brackets. 
We then have the following rules and configurations: 


(s_x, b, s_b, λ_b, R) ⊢ s_b: λ_b [a] b 
⇒ (s_b, a, s_b, a, R) ⊢ s_b: λ_b a [b] 
⇒ (s_b, b, s_b, b, R) ⊢ s_b: λ_b a b [#] 
⇒ (s_b, #, s_x', σ_b, L) ⊢ s_x': λ_b a [b] σ_b 
⇒ (s_x', b, s_x', b, L) ⊢ s_x': λ_b [a] b σ_b 
⇒ (s_x', a, s_x', a, L) ⊢ s_x': [λ_b] a b σ_b 
⇒ (s_x', λ_b, s_x, b, R) ⊢ s_x: b [a] b σ_b 
⇒ (s_x, a, s_a, λ_a, R) ⊢ s_a: b λ_a [b] σ_b 
⇒ (s_a, b, s_a, b, R) ⊢ s_a: b λ_a b [σ_b] 
⇒ (s_a, σ_b, s_a, σ_b, R) ⊢ s_a: b λ_a b σ_b [#] 
⇒ (s_a, #, s_x', σ_a, L) ⊢ s_x': b λ_a b [σ_b] σ_a 
⇒ (s_x', σ_b, s_x', σ_b, L) ⊢ s_x': b λ_a [b] σ_b σ_a 
⇒ (s_x', b, s_x', b, L) ⊢ s_x': b [λ_a] b σ_b σ_a 
⇒ (s_x', λ_a, s_x, a, R) ⊢ s_x: b a [b] σ_b σ_a 
⇒ (s_x, b, s_b, λ_b, R) ⊢ s_b: b a λ_b [σ_b] σ_a 
⇒ (s_b, σ_b, s_b, σ_b, R) ⊢ s_b: b a λ_b σ_b [σ_a] 
⇒ (s_b, σ_a, s_b, σ_a, R) ⊢ s_b: b a λ_b σ_b σ_a [#] 
⇒ (s_b, #, s_x', σ_b, L) ⊢ s_x': b a λ_b σ_b [σ_a] σ_b 
⇒ (s_x', σ_a, s_x', σ_a, L) ⊢ s_x': b a λ_b [σ_b] σ_a σ_b 
⇒ (s_x', σ_b, s_x', σ_b, L) ⊢ s_x': b a [λ_b] σ_b σ_a σ_b 
⇒ (s_x', λ_b, s_x, b, R) ⊢ s_x: b a b [σ_b] σ_a σ_b 
⇒ (s_x, σ_b, s_x, b, R) ⊢ s_x: b a b b [σ_a] σ_b 
⇒ (s_x, σ_a, s_x, a, R) ⊢ s_x: b a b b a [σ_b] 
⇒ (s_x, σ_b, s_x, b, R) ⊢ s_x: b a b b a b [#] 

and applying rule 

(s_x, #, s_y, #, #) 

we have 


s_y: b a b b a b [#] 


and we are done. 
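
As a check on the duplication rules, here is a small Python sketch (an illustration only; the function names `run`, `duplicate_rules`, and `duplicate` are hypothetical, and the symbols λ_a, λ_b, σ_a, σ_b are modeled as two-character strings). The rule set is generated per letter, which also shows how the rules extend to a larger alphabet.

```python
BLANK = "#"

def run(rules, tape, state, halt_state, max_steps=10_000):
    """Run rules (state, read, new_state, write, move) from the leftmost square."""
    delta = {(q, r): (p, w, m) for q, r, p, w, m in rules}
    cells, head = dict(enumerate(tape)), 0
    for _ in range(max_steps):
        if state == halt_state:
            break
        state, cells[head], move = delta[(state, cells.get(head, BLANK))]
        head += {"L": -1, "R": 1, "#": 0}[move]
    return "".join(cells.get(i, BLANK) for i in range(max(cells) + 1))

def duplicate_rules(alphabet="ab"):
    """Rules for duplicating a string; one carrying state s_c per letter c."""
    lam = {c: "λ" + c for c in alphabet}  # marks the letter being copied
    sig = {c: "σ" + c for c in alphabet}  # placeholder for the copied letter
    rules = [("x", BLANK, "y", BLANK, "#")]
    for c in alphabet:
        carry = "s_" + c
        rules += [("x", c, carry, lam[c], "R"),       # pick up the letter c
                  (carry, BLANK, "x'", sig[c], "L"),  # drop its placeholder at the end
                  ("x'", lam[c], "x", c, "R"),        # restore c and advance
                  ("x'", c, "x'", c, "L"),            # walk back over letters
                  ("x'", sig[c], "x'", sig[c], "L"),  # and over placeholders
                  ("x", sig[c], "x", c, "R")]         # final pass: placeholder to letter
        for d in alphabet:
            rules += [(carry, d, carry, d, "R"),
                      (carry, sig[d], carry, sig[d], "R")]
    return rules

def duplicate(word):
    return run(duplicate_rules(), word, "x", "y").rstrip(BLANK)

print(duplicate("bab"))  # babbab
```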


Definition 5.2 A word w ∈ Σ* is accepted by a Turing machine T if, beginning 
in the start state, there is a way to read w and be in the halt state. The 
language accepted by a Turing machine T is the set of all words accepted by 
T. 


We next show how to use the machine as an acceptor. We begin by showing 
that a Turing machine can recognize a regular language. We already know that 
an automaton recognizes a regular language, so what we shall basically do is 
program it to imitate an automaton. Assume that we have a word in the Turing 
machine which we want the machine to read so that it can determine whether it 
wants to accept it. An automaton reads a word beginning with the first letter and 
reads from left to right until it has reached the last letter. We need our Turing 
machine to do the same. We therefore start from the configuration 


[Configuration: the machine in state s0 with the tape # a1 a2 a3 a4 a5 a6 a7] 


and we are ready to begin. 
We have another way of representing a Turing machine which makes it look 
more like an automaton. We shall represent the rule 


(s_i, a, s_j, b, R) 

by the symbol 


[Diagram: an arrow from s_i to s_j, labeled above with the letter read, a, and below with the letter printed and the direction of the move.] 


so that the program go-end which has rules 


(s0, a, s1, a, R) (s0, b, s1, b, R) (s1, a, s1, a, R) 
(s1, b, s1, b, R) (s1, #, h, #, #) 

may be represented by 


[Diagram: states s0, s1, and h. Arrows labeled a over (a, R) and b over (b, R) lead from s0 to s1 and from s1 back to itself, and an arrow for the rule (s1, #, h, #, #) leads from s1 to h.] 


Notice that the letter above the arrow is the letter which is read by the machine. 
We really do not care what is printed out. We could print a # in each square as 
it is read or we could simply print back the letter that is read. We shall choose 
to do the latter. Each time a letter is read, we wish the machine to move one 
square to the right, so that the next letter is read. 

We are now ready to imitate an automaton. If the symbol 


[Diagram: an arrow labeled a from state s_i to state s_j.] 


occurs in an automaton, we shall imitate it with the rule 

(s_i, a, s_j, a, R) 


or the symbol 


[Diagram: an arrow from s_i to s_j labeled a above and (a, R) below.] 


It may be recalled that a word is accepted by an automaton if, after the word is 
read, the automaton is in an acceptance state. For every acceptance state s of 
the automaton, we will add a rule 


(s, #, h, #, #) 

shown as 


[Diagram: an arrow labeled # from s to the halt state h.] 


so that if the word is accepted by the automaton, the Turing machine will also 
end up in state s, read the # following the word, and halt. Thus the Turing 
machine halting will mean that it accepts a word. Further, a Turing machine 
programmed in this manner accepts the same words as the automaton it is 
imitating. 

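For example, given the automaton 


[Figure: state diagram of an automaton.] 


we have the corresponding program for the Turing machine 


[Figure: the same diagram with each arrow labeled by the letter read above and by the letter printed and the move R below, together with arrows labeled # from the acceptance states to the halt state.] 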


Since a Turing machine can be programmed to accept the same language as 
a given automaton, we have the following theorem: 


Theorem 5.1 Every regular language is recognized by a Turing machine. 
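
The construction behind Theorem 5.1 is mechanical, as the following Python sketch suggests. It is an illustration under stated assumptions: the automaton is given as a dictionary of transitions, and the function names `dfa_to_tm_rules` and `accepts`, as well as the sample automaton (words over {a, b} ending in b), are hypothetical and not taken from the text.

```python
def dfa_to_tm_rules(transitions, accepting, halt="h", blank="#"):
    """Turn automaton transitions {(state, letter): state} into Turing machine rules."""
    # Each transition reads a letter, prints it back, and moves right.
    rules = [(s, a, t, a, "R") for (s, a), t in transitions.items()]
    # Each acceptance state halts on the blank that follows the word.
    rules += [(s, blank, halt, blank, "#") for s in accepting]
    return rules

def accepts(rules, word, start, halt="h", blank="#"):
    """Run the rules on word; a missing rule means the machine hangs."""
    delta = {(q, r): (p, w, m) for q, r, p, w, m in rules}
    tape, head, state = list(word) + [blank], 0, start
    while state != halt:
        key = (state, tape[head])
        if key not in delta:
            return False
        state, tape[head], move = delta[key]
        head += {"L": -1, "R": 1, "#": 0}[move]
    return True

# Hypothetical automaton: accepts the words over {a, b} that end in b.
transitions = {("s0", "a"): "s0", ("s0", "b"): "s1",
               ("s1", "a"): "s0", ("s1", "b"): "s1"}
rules = dfa_to_tm_rules(transitions, accepting={"s1"})
print(accepts(rules, "aab", "s0"), accepts(rules, "aba", "s0"))  # True False
```

Because every rule moves right and the word is finite, the simulated machine always either halts on the trailing blank or hangs there, so the loop terminates.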


Definition 5.3 The languages recognized by Turing machines are called 
recursively enumerable. 


We have already shown that regular languages are recursively enumerable 
and claimed that context-free languages are recursively enumerable. At this 
point we shall show how a Turing machine recognizes the language {a^n b^n : n 
is a positive integer}, which is context-free, and how it recognizes {a^n b^n c^n : n 
is a positive integer}, which is not context-free. 

We begin by designing a program for a Turing machine that will recognize 
the language {a^n b^n : n is a positive integer}. We basically want the Turing 
machine to read an a, then read a b, return to read an a, and continue until all 
of the as and bs have been read, if there is an equal number of them. We begin 
by reading an a in the first square. We want to know that we have counted this 
a, so we shall change it to A. We do this with rule 


(s0, a, s1, A, R). 


We now want to go right until we reach a b, which we shall change to a B. To 
get to b, we need to pass over each a without changing it and also after the first 
time we may have to pass over Bs without changing them to reach a b. We do 
this with the rules 


(s1, a, s1, a, R) 
(s1, B, s1, B, R). 


When we reach a b, we want to change it to a B and go back left. We do that 
with the rule 


(s1, b, s2, B, L). 


We now need to go back to find the second a. To do this we go left until we 
reach an A. This will tell us that the next letter to the right should be the next 
a. To go back, we need to pass over Bs, and as to get to A. We do this with the 
rules 


(s2, B, s2, B, L) (s2, a, s2, a, L). 


When we reach A, we want to go one square to the right to read another a, if 
there is one. We do this with the rule 


(s2, A, s0, A, R). 


This puts us back into the cycle of reading another a and another b. If we run 
out of bs before we run out of as the system will be in state s1 and eventually try 
to read a blank so it will hang. If we have read the last a, then when we reach 
A and go right one square, we will read a B. At this point we need to check to 
see if there is another b. First we change state if we are in s0 and read a B. We 
do this with rule 


(s0, B, s3, B, R). 

In state s3, we expect to read nothing but Bs and a blank. Thus we have the rules 


(s3, B, s3, B, R) (s3, #, h, #, #). 

This may also be shown as the labeled graph 


[Figure: labeled directed graph of the rules above, with states s0, s1, s2, s3, and h.] 
For example consider the string aabb. The initial configuration is 


s0: [a] a b b 


where the state is written first and the scanned square is enclosed in brackets. 
We then have the following rules and configurations: 


(s0, a, s1, A, R) ⊢ s1: A [a] b b 
⇒ (s1, a, s1, a, R) ⊢ s1: A a [b] b 
⇒ (s1, b, s2, B, L) ⊢ s2: A [a] B b 
⇒ (s2, a, s2, a, L) ⊢ s2: [A] a B b 
⇒ (s2, A, s0, A, R) ⊢ s0: A [a] B b 
⇒ (s0, a, s1, A, R) ⊢ s1: A A [B] b 
⇒ (s1, B, s1, B, R) ⊢ s1: A A B [b] 
⇒ (s1, b, s2, B, L) ⊢ s2: A A [B] B 
⇒ (s2, B, s2, B, L) ⊢ s2: A [A] B B 
⇒ (s2, A, s0, A, R) ⊢ s0: A A [B] B 
⇒ (s0, B, s3, B, R) ⊢ s3: A A B [B] 
⇒ (s3, B, s3, B, R) ⊢ s3: A A B B [#] 
⇒ (s3, #, h, #, #) ⊢ h: A A B B [#] 
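
The ten rules of this program are few enough to run directly. The sketch below is a Python illustration, not part of the original text; the names `RULES_ANBN` and `accepts` are hypothetical. A word is accepted if the machine reaches the halt state h, and a missing rule is treated as the machine hanging.

```python
RULES_ANBN = [
    ("s0", "a", "s1", "A", "R"),   # count an a
    ("s1", "a", "s1", "a", "R"),   # skip the remaining as
    ("s1", "B", "s1", "B", "R"),   # skip bs already counted
    ("s1", "b", "s2", "B", "L"),   # count a b and turn back
    ("s2", "B", "s2", "B", "L"),
    ("s2", "a", "s2", "a", "L"),
    ("s2", "A", "s0", "A", "R"),   # back at the last counted a
    ("s0", "B", "s3", "B", "R"),   # no as left: check the bs
    ("s3", "B", "s3", "B", "R"),
    ("s3", "#", "h", "#", "#"),    # only Bs remained: halt
]

def accepts(rules, word, start="s0", halt="h", blank="#", max_steps=10_000):
    """Return True if the machine halts on word, False if it hangs."""
    delta = {(q, r): (p, w, m) for q, r, p, w, m in rules}
    cells, head, state = dict(enumerate(word)), 0, start
    for _ in range(max_steps):
        if state == halt:
            return True
        key = (state, cells.get(head, blank))
        if key not in delta:
            return False
        state, cells[head], move = delta[key]
        head += {"L": -1, "R": 1, "#": 0}[move]
    return False

print([w for w in ["ab", "aabb", "aab", "abb", "ba", ""] if accepts(RULES_ANBN, w)])
# ['ab', 'aabb']
```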


Next we design a program for a Turing machine that will recognize the 
language {a^n b^(n+1) : n is a positive integer}. 

We begin by reading an a in the first square. We want to know that we have 
counted this a, so we shall change it to A. We do this with rule 


(s0, a, s1, A, R). 

We now want to go right until we reach a b, which we shall change to a B. To 
get to b, we need to pass over each a without changing it and also after the first 
time we may have to pass over Bs without changing them to reach a b. We do 
this with the rules 


(s1, a, s1, a, R) (s1, B, s1, B, R). 

When we reach a b, we want to change it to a B and start back to look for 
another a. We do this with the rule 

(s1, b, s2, B, L). 


To go back, we need to pass over Bs and as to get to A. We do this with the 
rules 


(s2, B, s2, B, L) (s2, a, s2, a, L). 


When we reach A, we want to go one square to the right to read another a, if 
there is one. We do this with the rule 


(s2, A, s0, A, R). 

This puts us back into the cycle of reading another a and b. If we run out of bs 
before we run out of as the system will hang. If we have read the last a, then 
when we reach A and go right one square, we will read a B. At this point we 
need to check to see if there is another b. First we change state if we are in s0 
and read a B. We do this with rule 


(s0, B, s3, B, R). 


In state s3, we expect to read nothing but Bs, one b, and a blank. Thus we have the 
rules 


(s3, B, s3, B, R) (s3, b, s4, B, R) (s4, #, h, #, #). 

We now design a program for a Turing machine that will recognize the 
language {a^n b^n c^n : n is a positive integer}. In a manner similar to the previous 
example we want the Turing machine to read an a, then read a b, then read a c, 
and continue until all of the as, bs, and cs have been read, if there are an equal 
number of them. We begin by reading an a in the first square. We want to know 
that we have counted this a, so we shall change it to A. We do this with rule 


(s0, a, s1, A, R). 

We now want to go right until we reach a b, which we shall change to a B. To 
get to b, we need to pass over each a without changing it and also after the first 
time we may have to pass over Bs without changing them to reach a b. We do 
this with the rules 


(s1, a, s1, a, R) (s1, B, s1, B, R). 


When we reach a b, we want to change it to a B and continue onward. We do 
that with the rule 


(s1, b, s2, B, R). 


We now need to continue until we find a c. We will need to pass over bs and 
Cs. We do this with the rules 


(s2, b, s2, b, R) (s2, C, s2, C, R). 


We next want to read c, replace it with a C, and start back to look for another 
a. We do this with the rule 


(s2, c, s3, C, L). 


To go back, we need to pass over Cs, bs, Bs, and as to get to A. We do this 
with the rules 


(s3, C, s3, C, L) (s3, b, s3, b, L) (s3, B, s3, B, L) (s3, a, s3, a, L). 


When we reach A, we want to go one square to the right to read another a, if 
there is one. We do this with the rule 


(s3, A, s0, A, R). 


This puts us back into the cycle of reading another a, b, and c. If we run out 
of bs or cs before we run out of as the system will hang. If we have read the 
last a, then when we reach A and go right one square, we will read a B. At this 
point we need to check to see if there is another b. First we change state if we 
are in s0 and read a B. We do this with rule 


(s0, B, s4, B, R). 

In state s4, we expect to read nothing but Bs, Cs, and a blank. Thus we have 
the rules 


(s4, B, s4, B, R) (s4, C, s4, C, R) (s4, #, h, #, #). 


This may also be shown as the labeled directed graph 


[Figure: labeled directed graph of the rules above, with states s0, s1, s2, s3, s4, and h.] 


For example consider the string aabbcc. The initial configuration is 


s0: [a] a b b c c 


where the state is written first and the scanned square is enclosed in brackets. 
We then have the following rules and configurations: 


(s0, a, s1, A, R) ⊢ s1: A [a] b b c c 
⇒ (s1, a, s1, a, R) ⊢ s1: A a [b] b c c 
⇒ (s1, b, s2, B, R) ⊢ s2: A a B [b] c c 
⇒ (s2, b, s2, b, R) ⊢ s2: A a B b [c] c 
⇒ (s2, c, s3, C, L) ⊢ s3: A a B [b] C c 
⇒ (s3, b, s3, b, L) ⊢ s3: A a [B] b C c 
⇒ (s3, B, s3, B, L) ⊢ s3: A [a] B b C c 
⇒ (s3, a, s3, a, L) ⊢ s3: [A] a B b C c 
⇒ (s3, A, s0, A, R) ⊢ s0: A [a] B b C c 
⇒ (s0, a, s1, A, R) ⊢ s1: A A [B] b C c 
⇒ (s1, B, s1, B, R) ⊢ s1: A A B [b] C c 
⇒ (s1, b, s2, B, R) ⊢ s2: A A B B [C] c 
⇒ (s2, C, s2, C, R) ⊢ s2: A A B B C [c] 
⇒ (s2, c, s3, C, L) ⊢ s3: A A B B [C] C 
⇒ (s3, C, s3, C, L) ⊢ s3: A A B [B] C C 
⇒ (s3, B, s3, B, L) ⊢ s3: A A [B] B C C 
⇒ (s3, B, s3, B, L) ⊢ s3: A [A] B B C C 
⇒ (s3, A, s0, A, R) ⊢ s0: A A [B] B C C 
⇒ (s0, B, s4, B, R) ⊢ s4: A A B [B] C C 
⇒ (s4, B, s4, B, R) ⊢ s4: A A B B [C] C 
⇒ (s4, C, s4, C, R) ⊢ s4: A A B B C [C] 
⇒ (s4, C, s4, C, R) ⊢ s4: A A B B C C [#] 
⇒ (s4, #, h, #, #) ⊢ h: A A B B C C [#]. 
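
The program for {a^n b^n c^n} can be exercised in the same way. This is a Python sketch for illustration; the names `RULES_ANBNCN` and `accepts` are hypothetical, and a missing rule is treated as the machine hanging.

```python
RULES_ANBNCN = [
    ("s0", "a", "s1", "A", "R"),                               # count an a
    ("s1", "a", "s1", "a", "R"), ("s1", "B", "s1", "B", "R"),  # move to the next b
    ("s1", "b", "s2", "B", "R"),                               # count a b
    ("s2", "b", "s2", "b", "R"), ("s2", "C", "s2", "C", "R"),  # move to the next c
    ("s2", "c", "s3", "C", "L"),                               # count a c, turn back
    ("s3", "C", "s3", "C", "L"), ("s3", "b", "s3", "b", "L"),
    ("s3", "B", "s3", "B", "L"), ("s3", "a", "s3", "a", "L"),
    ("s3", "A", "s0", "A", "R"),                               # back at the last counted a
    ("s0", "B", "s4", "B", "R"),                               # no as left: final check
    ("s4", "B", "s4", "B", "R"), ("s4", "C", "s4", "C", "R"),
    ("s4", "#", "h", "#", "#"),
]

def accepts(rules, word, start="s0", halt="h", blank="#", max_steps=10_000):
    """Return True if the machine halts on word, False if it hangs."""
    delta = {(q, r): (p, w, m) for q, r, p, w, m in rules}
    cells, head, state = dict(enumerate(word)), 0, start
    for _ in range(max_steps):
        if state == halt:
            return True
        key = (state, cells.get(head, blank))
        if key not in delta:
            return False
        state, cells[head], move = delta[key]
        head += {"L": -1, "R": 1, "#": 0}[move]
    return False

print(accepts(RULES_ANBNCN, "aabbcc"), accepts(RULES_ANBNCN, "aabbc"))  # True False
```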


We now show how to perform two arithmetic operations on a Turing machine. 
The first of these is addition, which is trivial. Suppose we have p, represented 
by a string of 1s of length p, and q represented by a string of 1s of length q, so 
we have the configuration 


0 
1 1 ... 1 # 1 1 ... 1 

We simply use delete(#) to delete the blank, and we have 


0 
1 1 ... 1 1 1 ... 1 


which is a string of 1s of length p + q, which represents p + q. 

We now sketch a method of multiplying positive integers p and q, which 
are represented by strings of 1s of lengths p and q respectively. Details of the 
multiplication are left to the reader. We begin with the configuration 


0 
1 1 ... 1 # 1 1 ... 1 


Replace the first 1 with a marker. Replace the second 1 with a marker (if there is 
no second 1, move left and delete the marker, leaving only the string q). Otherwise, 
move to the end of the string of 1s of length q. Place a blank (so that the length 
of q is retained), and place another string of 1s of length q after the blank. Return 
to the last marker and then go right to the next 1 in the string for p. Replace the 1 
with a marker and again place a string of 1s of length q at the end of the third 
string. Continue this until there are no more 1s in the string for p. At that point 
when the machine tries to read a 1 from p, it will read a #. Go to the end of the 
last string. Go left deleting all blanks. Then continue left until reaching a marker. 
Delete all markers to produce the answer. 
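
The effect of the two arithmetic procedures on the tape can be sketched at a higher level in Python. This is only an illustration of the tape contents, not a Turing machine program: the function names `add_unary` and `multiply_unary` and the marker letter M are assumptions introduced here.

```python
def add_unary(tape):
    """'111#11' -> '11111': delete the blank separating p and q."""
    p, q = tape.split("#")
    return p + q

def multiply_unary(tape, marker="M"):
    """Follow the sketch: mark each 1 of p and append a copy of q for it."""
    p, q = tape.split("#")
    cells = list(p) + ["#"] + list(q)
    for i in range(len(p)):
        cells[i] = marker              # mark the next 1 of p
        if i > 0:                      # the original q serves as the first copy
            cells += ["#"] + list(q)   # a blank, then another copy of q
    # Delete the blanks and the markers, leaving p * q ones.
    return "".join(c for c in cells if c == "1")

print(add_unary("111#11"), multiply_unary("111#11"))  # 11111 111111
```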


Exercises 


(1) Supply the details for the Turing machine program delete(c). 

(2) Design a Turing machine for the program go-left(n) which moves the head 
of the machine to the left n squares. 

(3) Design a Turing machine for insert(c) which begins at the square of the 
letter we were replacing and returns to that square. (Hint: Instead of placing 
the c in the square, we would place a marker, and then when we had finished 
moving the letters we would return to the marker and replace it with a c.) 

(4) Design a Turing machine for delete(c) where the machine moves over to 
the square containing c and replaces it with a marker which is not part of 
the regular alphabet. It then goes to the end of the string and moves each 
letter to the left in a similar manner to the one we use to move letters to 
the right in insert(c). It replaces the marker with the letter to its right and 
then changes states to remain where the letter was inserted. 

(5) Design a Turing machine that multiplies two positive integers. 

(6) Design a Turing machine that subtracts a smaller number from a larger 
one. 

(7) Design a Turing machine that accepts the language described by the 
expression ab*c*(b ∨ ac). 
(8) Design a Turing machine that accepts the language described by the 
expression abc(b ∨ ac)*b. 

(9) Design a Turing machine that accepts all strings in a and b except aba 
and abb. 

(10) Design a Turing machine that accepts the language described by the 
expression (aa*bb*)*. 

(11) Write the list of rules and configurations to describe the results when the 
program that accepts a^n b^n for a positive number n tries to read a7b°. 

(12) Write the list of rules and configurations to describe the results when the 
program that accepts a^n b^n for a positive number n tries to read a?b?. 

(13) Write the list of rules and configurations to describe the results when the 
program which accepts a^n b^n c^n for a positive number n tries to read a*b?c?. 

(14) Design a Turing machine that accepts the language of all strings in a and 
b that have the same number of as and bs. 

(15) For a given string s consisting of as and bs, define reverse(s) to be the 
string s written backwards. Thus reverse(abbb) = bbba. Design a Turing 
machine that, given a string s, prints its reverse. 

(16) A palindrome over the set {a, b, c} is a string s such that s = reverse(s). 
Thus abbcbba, abba, abcba, and cbaabc are palindromes. An even palindrome 
has an even number of letters in the string and an odd palindrome 
has an odd number of letters in the string. Design a Turing machine that 
accepts all even palindromes. 

(17) Design a Turing machine that accepts all odd palindromes. 

(18) Design a Turing machine that accepts all words of the form a^n b^n a^n for 
any positive integer n. 

(19) Design a Turing machine that accepts all words of the form ww where w 
is a string of as and bs. 


5.2 Nondeterministic Turing machines and acceptance of 
context-free languages 


We begin by showing that a context-free language can be accepted by a non- 
deterministic Turing machine and then show that any language accepted by a 
nondeterministic Turing machine is accepted by a deterministic Turing machine. 


Definition 5.4 A Turing machine is not deterministic if δ is a finite subset of 
(Q × Γ) × (Q × Γ × N). 


Thus δ is replaced by a relation, which we shall denote by θ. Thus θ(s, a) 
is a subset of Q × Γ × N. Since a context-free language is accepted by a 
pushdown automaton, we show that a pushdown automaton can be imitated by a 
Turing machine and hence this Turing machine accepts a context-free language. 
The only problem at this point is that a pushdown automaton is not deterministic 
and hence the Turing machine we create is not deterministic. Thus we must show 
that any language accepted by a nondeterministic Turing machine is accepted 
by a deterministic Turing machine. 


Theorem 5.2 A context-free language is accepted by a nondeterministic Tur- 
ing machine. 


Proof We shall prove this theorem informally assuming that the steps in the 
conversion can be easily replaced by subroutines in a Turing machine. When 
at a given step, we have the word w to read and the word u in the stack of the 
pushdown automaton, we shall associate this with wVu on the tape of the Turing 
machine. The word w is accepted if wV# is converted to #V#. Assume the tape begins 
with a blank followed by the word. Assume also that the Turing machine is 
positioned at the first letter of the word. 

For each of the rules for a pushdown automaton, we shall give the corre- 
sponding instructions for a Turing machine. 


1. ((a, s, E), (t, D)) In state s, a is read and E is popped, go to state t 
and push D. 

2. ((a, s, λ), (t, D)) In state s, a is read, go to state t and push D. 

3. ((λ, s, λ), (s, D)) In state s, push D. 

4. ((a, s, E), (t, λ)) In state s, a is read, pop E and go to state t. 

5. ((λ, s, E), (s, λ)) In state s, pop E. 

6. ((a, s, λ), (t, λ)) In state s, read a and go to state t. 

7. ((a, s, λ), (s, λ)) In state s, read a. 

The corresponding instructions for the Turing machine are as follows: 


1. Go to the first position after V, delete E and insert D. Return to a and delete 
a. Go to state t. 

2. Go to the first position after V, insert D. Return to a and delete a. Go to 
state t. 

3. Go to the first position after V, insert D. Return to the original letter. Go to 
state s. 

4. Go to the first position after V, delete E. Return to a and delete a. Go to 
state t. 

5. Go to the first position after V, delete E. Return to the original position. Go 
to state s. 

6. Delete a. Go to state t. 

7. Delete a. Go to state s. 

We now show that a language accepted by a nondeterministic Turing machine 
T is accepted by a deterministic Turing machine T′. Since T is nondeterministic 
and θ(s, x) is a subset of Q × Γ × N, if θ(s, x) contains k elements, we 
shall denote them by θ_0(s, x), θ_1(s, x), θ_2(s, x), ..., θ_(k-1)(s, x). Thus each of the 
θ_i(s, x) is well defined. For example we could have θ_0(s, x) = (s′, b, R), so the 
rule is (s, x, s′, b, R), and θ_1(s, x) = (s″, C, L), so the rule is (s, x, s″, C, L). 
We number the elements in each θ(s_i, a_j) for each state s_i and each a_j in Σ. 
Hence if we are in state s and have input x, and are given an integer j, 
we can use θ_j(s, x) to supply the rule to use. Assume that we never need more 
than n + 1 integers to label the subsets for any θ(s_i, a_j). Then if we have a 
sequence of nonnegative integers m_1, m_2, ..., m_p less than or equal to n, we 
could sequentially apply θ_(m_1), θ_(m_2), ..., θ_(m_p), which together with the state and 
input would give us the rules to use. If we apply all possible relevant sequences, 
we can produce all possible computations. Hence if a word is accepted by the 
Turing machine T, it will be accepted in one of these computations. 

The next problem is the production of the sequences of integers. We shall 
label these sequences N_0, N_1, N_2, ..., N_i, ... We begin with N_0 = 0 and 
simply count in base n + 1. Thus the sequences are 


(0), (1), (2), ..., (n), (1, 0), (1, 1), (1, 2), (1, 3), ..., (1, n), (2, 0), ... 

The sequence following 

(1, 3, 4, 3, 2, 3) 

is 

(1, 3, 4, 3, 2, 4), 

and the sequence following 

(1, 3, 4, n, n, n) 

is 

(1, 3, 5, 0, 0, 0). 


The subroutine in which a Turing machine changes the number N_k to N_(k+1) is 
straightforward and is left to the reader in the problems, with the warning that 
as the length of the sequence is increased, it is increased to the right on the tape. 
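
The step from one sequence to the next is ordinary counting in base n + 1. A minimal Python sketch follows; the function name `next_sequence` is hypothetical, and the list representation stands in for the squares of the tape.

```python
def next_sequence(seq, n):
    """Return the sequence that follows seq when counting in base n + 1."""
    digits = list(seq)
    i = len(digits) - 1
    while i >= 0 and digits[i] == n:   # carry: an n becomes 0
        digits[i] = 0
        i -= 1
    if i < 0:                          # every entry was n: the sequence grows by one
        return [1] + digits
    digits[i] += 1
    return digits

print(next_sequence([1, 3, 4, 3, 2, 3], 5))  # [1, 3, 4, 3, 2, 4]
print(next_sequence([1, 3, 4, 5, 5, 5], 5))  # [1, 3, 5, 0, 0, 0]
print(next_sequence([5], 5))                 # [1, 0]
```

On the tape the growth step would be carried out by writing a 1 in the first square and appending a 0 on the right, which yields the same sequence as prepending a 1 here.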

We next have to decide how to proceed in reading a given word. We will 
place the word to be read, followed by | and the current sequence on the tape. 
At each step we will mark the letter being read, keeping track of the state the 
machine is in, and then proceed to the right to locate the proper number in 
the sequence. If the machine is in state s, and we mark a, we shall use a_s as 
the marker. Thus we retain information about both the letter read and the current 
state of the machine. As we use a number in the sequence we mark it with a prime (′), so 
that we proceed each time to the first unmarked number in the sequence, select 
it, mark it, and then return to the marked letter with the information needed to 
select the proper path for the Turing machine to take, given the state and the 
letter being read. For example suppose the Turing machine is in state s, reads 
letter a, and finds that j is the number selected. It then proceeds with θ_j(s, a) 
to supply the rule to use. 

As an illustration suppose we have 


[Configuration: the machine is in state s4 scanning a letter b of the word, which is followed by | and a sequence of integers, some of them already marked.] 


We change b to b_s4, so we have 


[Configuration: the same tape with the scanned b replaced by b_s4.] 


We then move to 1, the first unmarked integer, and mark it, so we have 


[Configuration: the same tape with the integer 1 marked and the machine in state q_1.] 


where the subscript of the state is the number selected. We then return to b_s4, 
where we have θ*(q_1, b_s4) = θ_1(s4, b). 
The instructions could be as follows 


θ*(s_i, a) = (t_1, a_(s_i), R) 

θ*(t_1, x) = (t_1, x, R) for x ≠ | 

θ*(t_1, |) = (t_2, |, R) 

θ*(t_2, n′) = (t_2, n′, R) for marked integer n′ 

θ*(t_2, m) = (q_m, m′, L) for first unmarked integer m 

θ*(q_m, n′) = (q_m, n′, L) for all marked integers n′ 

θ*(q_m, x) = (q_m, x, L) if x is an unmarked letter of the alphabet 

θ*(q_m, α_(s_j)) = θ_m(s_j, α) if α is marked by s_j. 


Informally we state the procedure for testing a word for acceptance by a Turing 
machine as follows: First, given the word, duplicate the word and follow it by 
the first sequence so that we have 


w # w | 0. 


Perform the process above for testing the second copy of the word w, the copy 
following the #. At the beginning, the machine is positioned at the first letter of 
w. If θ*(t_2, #) occurs and the word is not accepted, the end of the sequence has 
been reached. Erase the symbols between # and |, again duplicate w after the #, 
proceed to the next sequence and repeat the process until the word is accepted or 
the length of the sequence exceeds m^(n+1), where there are m productions 
beginning with θ*(s_i, α_j) and n is larger than the number of productions beginning 
with any θ*(s_i, α_j) in the nondeterministic Turing machine, since 
all possibilities have then been tried. Since the new Turing machine just defined, 
which we shall call T′, is deterministic and a word is accepted by T′ if and only 
if it is accepted by T, we have shown that a word is accepted by a nondeterministic 
Turing machine if and only if it is accepted by a deterministic Turing 
machine. 

We finally conclude that if a language is context-free, it is accepted by a Turing 
machine. 
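
The search over choice sequences described above can be sketched in Python as follows. Several things here are assumptions made for the illustration: the relation θ is given as a dictionary of lists, the function name `accepts_nondeterministic` and the sample machine are hypothetical, and the sequences are enumerated by length with `itertools.product` rather than by counting in base n + 1, so that sequences beginning with 0 are also tried. The bound `max_len` is added only so that the sketch terminates on words that are not accepted.

```python
from itertools import product

def accepts_nondeterministic(theta, word, start, halt="h", blank="#", max_len=8):
    """Deterministically replay every choice sequence of length up to max_len.

    theta maps (state, symbol) to a list of (new_state, write, move) choices.
    """
    width = max(len(choices) for choices in theta.values())
    for length in range(1, max_len + 1):
        for seq in product(range(width), repeat=length):
            cells, head, state = dict(enumerate(word)), 0, start  # fresh copy of w
            for j in seq:
                choices = theta.get((state, cells.get(head, blank)), [])
                if j >= len(choices):
                    break                  # this sequence selects no rule here
                state, cells[head], move = choices[j]
                head += {"L": -1, "R": 1, "#": 0}[move]
                if state == halt:
                    return True
    return False

# Hypothetical machine: guesses where the substring ab begins.
theta = {
    ("s0", "a"): [("s0", "a", "R"), ("s1", "a", "R")],
    ("s0", "b"): [("s0", "b", "R")],
    ("s1", "b"): [("h", "b", "#")],
}
print(accepts_nondeterministic(theta, "bbab", "s0"))  # True
print(accepts_nondeterministic(theta, "bba", "s0"))   # False
```

Restarting from a fresh copy of the word for each sequence plays the role of erasing the tape between # and | and duplicating w again.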


Exercises 


(1) In Theorem 5.2 write a subroutine for producing the sequence of integers 
used for showing that a language accepted by a nondeterministic Turing 
machine can be accepted by a deterministic Turing machine. 


Find Turing machines (not necessarily deterministic) that accept the 
context-free languages. 
(2) The language containing twice as many as as bs. 
(3) The language containing the same number of as and bs. 
(4) The language {a^n b^n : n = 1, 2, ...}. 
(5) The language {a^n b^k c^n : k, n = 1, 2, ...}. 
(6) The language of palindromes of odd length on the alphabet {a, b, c}. 
(7) The language of palindromes of even length on the alphabet {a, b, c}. 
(8) The language of all palindromes on the alphabet {a, b, c}. 
(9) The language {a"b"a"b” : m,n =1,2,...}. 
(10) The language {a"b"a" be" : m,n =1,2,...}. 


5.3 The halting problem for Turing machines 


One of the more frustrating problems in running a computer program occurs when 
the computer continues to run with no end in sight. One has the dilemma of 
deciding whether the computer has just not finished the problem, and perhaps 
in five minutes or five hours it will finish the problem, or whether it is in a loop and 
will continue to run forever. This would be particularly true if the machine were 
as inefficient as a Turing machine. It would be nice if one could determine in 
advance whether the machine was going to halt, which is equivalent to solving 
the problem. Assuming Church’s Thesis, if there were an algorithmic step-by-step 
way of determining whether a machine was going to halt, then a program 
could be written for a Turing machine that could determine whether a machine 
was going to halt. 

This particular problem is called the halting problem and is formally stated 
as follows: 


Halting Problem Is there an algorithm which will determine whether, for 
any given Turing machine T and any input string w, the Turing machine T, 
given the input string w, will reach the halt state? 


Before answering this question, we look at some related properties of a 
Turing machine. When we were looking at acceptance of regular languages 
by a Turing machine, a word was accepted if the machine reached the halt 
state. Otherwise, the machine crashes, hangs, or loops. Any language which is 
accepted in this manner is called Turing acceptable. 

It would be nice if the Turing machine, when a word was read by it, would 
print Y at the beginning of the tape if the word were in the language and N if 
the word were not in the language. 


Definition 5.5 A language L is Turing decidable if there exists a Turing 
machine that, when a string is input, prints Y on the tape if the word is in L 
and N if the word is not in L. 


Theorem 5.3 If a language is Turing decidable then it is Turing acceptable. 


Proof If a language L is Turing decidable, then there is a Turing machine that 
prints Y if the word is in L and N if the word is not in L. Modify this machine 
so that instead of printing N, it goes into an infinite loop and instead of printing 
Y, goes into the halt state. Thus the new machine halts if a word is in L and 
goes into an infinite loop if the word is not in L. Thus L is Turing acceptable. 














Theorem 5.4 If a language L is Turing decidable, then its complement L′ = 
A* − L is Turing decidable. 


Proof If a language L is Turing decidable, then there is a Turing machine that 
prints Y if the word is in L and N if the word is not in L. Modify this machine 
so that instead of printing Y, it prints N and instead of printing N, it prints Y. 
This new machine prints Y if the word is in L′ and N if the word is not in L′. 
Thus L′ is Turing decidable. 

Theorem 5.5 A language L is Turing decidable if and only if both L and L′ 
are Turing acceptable. 


Proof If a language L is Turing decidable then by Theorem 5.4, its complement 
L′ is also Turing decidable. But by Theorem 5.3, L and L′ are then both 
Turing acceptable. 

Conversely, if L and L′ are both Turing acceptable, then there are machines 
M and M′ that accept languages L and L′ respectively. Place the input string 
in both M and M′ and run the two machines in alternation, one step of each at a 
time, since either machine alone might never halt. Every input string belongs to 
exactly one of L and L′, so exactly one of the two machines eventually halts. 
If the input string is accepted by M, then print the letter Y. 
If the input string is accepted by M′, then print the letter N. Since this process 
is algorithmic, by Church’s Thesis, it can be duplicated by a Turing machine 
M″. Hence L is Turing decidable and, by Theorem 5.4, its complement L′ is 
also Turing decidable. 
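
The converse direction rests on running two acceptors side by side. The following Python sketch illustrates that idea under stated assumptions: each acceptor is modeled as a generator that yields once per step and returns when it accepts, and the names `decide`, `accept_even_length`, and `accept_odd_length` are hypothetical examples, not machines from the text.

```python
_HALTED = object()  # sentinel returned by next() when a generator has finished

def decide(accept_l, accept_complement, word):
    """Alternate single steps of two acceptors until one of them halts."""
    m, m_prime = accept_l(word), accept_complement(word)
    while True:
        if next(m, _HALTED) is _HALTED:        # M halted: the word is in L
            return "Y"
        if next(m_prime, _HALTED) is _HALTED:  # M' halted: the word is in L'
            return "N"

def accept_even_length(word):
    """Toy acceptor for L: halts on even length, loops forever otherwise."""
    for _ in word:
        yield                 # one step per letter read
    while len(word) % 2 == 1:
        yield                 # word not in L: never halt

def accept_odd_length(word):
    """Toy acceptor for the complement L'."""
    for _ in word:
        yield
    while len(word) % 2 == 0:
        yield

print(decide(accept_even_length, accept_odd_length, "abab"))  # Y
print(decide(accept_even_length, accept_odd_length, "aba"))   # N
```

Running either acceptor to completion before starting the other would not work, because the first one may loop forever on a word outside its language.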














Before proceeding further we need to show that every Turing machine with 
alphabet A = {a, b} can be uniquely described by a string of as and bs. A 
Turing machine is uniquely determined by its set of rules, so it suffices to 
encode the rules. Let the set of states be S = {s1, s2, s3, ..., sn}. It 
may be recalled that a rule has the form 


(s_i, a, s_j, b, L) 


where the first and third components are states, and the second and fourth 
components are letters of a set of tape symbols Γ which contains the alphabet. 
We may also have # and A, which may be used as a marker, in Γ. The last 
component is either L or R. At times we have also included # in the last 
component, but this was not really necessary. It was merely used in the halt 
statement to indicate that the machine had halted and so had ceased moving. We 
proceed with the encoding as follows: If the first component is s_i, we begin 
with a string of as of length i. Thus if the first component is s3, we begin 
with aaa. We follow this with a b, which is used as a divider. For the symbols 
a, b, #, and A, we add to the string aa, bb, ab, and ba respectively. Thus if 
the first component is s4 and the next component is a, then our string at this 
point is aaaabaa. We follow this with the string of as corresponding to the 
state s_j and then another b for a divider. We then add the two-letter code for 
the fourth component, and finally a for L or b for R as the last component. Thus 
the string for (s4, b, s2, #, R) is aaaabbbaababb, where aaaa represents s4, b 
is a divider, bb represents b, aa represents s2, b is a divider, ab represents #, 
and b represents R. Once we have a string for each rule, we then concatenate 
all of the strings to form one long string of as and bs that represents the 
Turing machine. 

It is also possible to decode the string. For example, suppose we had the string 
ababaaabbbb .... The first a represents s1, the b is a divider, ab represents #, 
aaa represents s3, b is a divider, bb represents b, and b represents R. We have 
decoded the rule (s1, #, s3, b, R) and we continue reading the string to get the 
next rule. Denote the string that represents the Turing machine M by c(M). 
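
The encoding and decoding just described can be sketched in a few lines of Python. This is an illustration rather than part of the text: a rule is assumed to be a tuple (i, x, j, y, move) with states given by their indices, the marker symbol is written as the character "A" as printed above, and the function names are invented for the sketch.

```python
SYMBOL_CODE = {"a": "aa", "b": "bb", "#": "ab", "A": "ba"}  # two-letter codes
MOVE_CODE = {"L": "a", "R": "b"}


def encode_rule(rule):
    """Encode (i, x, j, y, move), i.e. the rule (s_i, x, s_j, y, move)."""
    i, x, j, y, move = rule
    return ("a" * i + "b" + SYMBOL_CODE[x]
            + "a" * j + "b" + SYMBOL_CODE[y] + MOVE_CODE[move])


def encode_machine(rules):
    """c(M): the concatenation of the strings of all rules of M."""
    return "".join(encode_rule(rule) for rule in rules)


def decode_rule(s, pos=0):
    """Decode one rule starting at s[pos]; return (rule, next position)."""
    symbol = {code: sym for sym, code in SYMBOL_CODE.items()}
    move = {code: mv for mv, code in MOVE_CODE.items()}

    def read_state(p):
        q = p
        while s[q] == "a":       # count the a's naming the state
            q += 1
        return q - p, q + 1      # skip the divider b

    i, pos = read_state(pos)
    x, pos = symbol[s[pos:pos + 2]], pos + 2
    j, pos = read_state(pos)
    y, pos = symbol[s[pos:pos + 2]], pos + 2
    mv, pos = move[s[pos]], pos + 1
    return (i, x, j, y, mv), pos


print(encode_rule((4, "b", 2, "#", "R")))   # aaaabbbaababb
print(decode_rule("ababaaabbbb"))           # ((1, '#', 3, 'b', 'R'), 11)
```

Decoding is unambiguous because the symbol codes have fixed length two, so the decoder always knows whether it is reading a state, a divider, a symbol, or a direction.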

Suppose we want to display a string that represents a Turing machine fol- 
lowed by input to be read by the machine. This is easily done by taking the 
string for the Turing machine M followed by a b and then followed by the input. 
Since no rule starts with a b, finding a b would indicate to the decoder that data 
followed rather than another rule. Thus the string representing machine M with 
input w is c(M)bw. 

Consider the language L₀ which consists of all strings c(M)bw representing 
a Turing machine M followed by an input word w, where the Turing machine 
accepts that input word. It is simple to construct a Turing machine MM that 
accepts L₀. Given a string t, MM first decodes t and, if it represents a Turing 
machine M followed by input data w, it inputs the data into the machine M, 
which can be recovered from the string t, and MM accepts the input t = c(M)bw 
if and only if M accepts the input string w. Therefore L₀ is Turing acceptable. 
If, in addition, L₀ is Turing decidable, then every Turing acceptable language 
is Turing decidable. To show this, note that if L₀ is Turing decidable, then 
there exists a Turing machine, say MM₀, which, given any input string w, will 
print Y if w is in L₀ and N if w is not in L₀. Assume that we have a Turing 
acceptable language L; then it is accepted by a Turing machine M(L). We can 
now construct a Turing machine M’(L) which, given an input string s, prints Y 
if s is in L and N if s is not in L. The Turing machine M’(L) is constructed by 
simply taking the string c(M(L)), adding the input string s to form c(M(L))bs, 
and then using it as an input string s’ for MM₀. If MM₀ prints Y for the input 
s’ then machine M(L) accepts s, so M’(L) prints Y. If MM₀ prints N for the 
input s’ then machine M(L) does not accept s, so M’(L) prints N. Thus L is 
Turing decidable. It thus follows that every Turing acceptable language is 
Turing decidable if and only if L₀ is Turing decidable. 

We now show that L₀ is not Turing decidable. We claim that if L₀ is Turing 
decidable then the language L₁ = {c(M) : M accepts c(M)} is Turing 
decidable. To show this, assume that L₀ is Turing decidable, so that the 
machine MM₀ above exists. We construct a machine M₁ as follows: Given an input 
string s, we simply take the string sbs and use it as input for MM₀. If MM₀ 
prints Y then s = c(M) for some M that accepts c(M). Therefore s ∈ L₁ and M₁ 
prints Y. If MM₀ prints N then s ≠ c(M) for any machine M that accepts c(M), 
so that s ∉ L₁ and M₁ prints N. Thus L₁ is Turing decidable. 

We now show that L₁ is not Turing decidable. Since, by Theorem 5.4, L₁ 
is Turing decidable if and only if its complement L₁’ is, we shall prove that 
L₁’ is not Turing decidable. The language L₁’ consists of all w in {a, b}* such 
that either w ≠ c(M) for any machine M, or w = c(M) for some machine M but M 
does not accept w. Here we are reminded of Russell’s paradox. Suppose M₁’ is a 
machine that accepts L₁’. Is it true that c(M₁’) ∈ L₁’? If so, then M₁’ does not 
accept c(M₁’) by definition of L₁’. But M₁’ does accept c(M₁’) because it 
accepts every element of L₁’, and we have a contradiction. Conversely assume 
c(M₁’) ∉ L₁’. Then c(M₁’) ∈ L₁, so that M₁’ accepts c(M₁’) by definition of L₁. 
But M₁’ only accepts elements of L₁’, so that c(M₁’) ∈ L₁’, again a 
contradiction. Hence L₁’ is not Turing acceptable and certainly not Turing 
decidable. Therefore L₁, and hence L₀, is not Turing decidable. 

Since we have shown that L₀ is Turing acceptable but not Turing decidable, 
we have the following theorem: 


Theorem 5.6 There exists a language that is Turing acceptable but not Turing 
decidable. 


Since L₀ is Turing acceptable but not Turing decidable, the following 
theorem follows from Theorem 5.5: 


Theorem 5.7 There exists a language which is Turing acceptable but whose 
complement is not Turing acceptable. 


We have also settled the halting problem, since a string is accepted if and 
only if the machine reaches the halt state. Hence an algorithm that would solve 
the halting problem is the algorithm which describes MM₀, and it does not 
exist. 


Theorem 5.8 Given a Turing machine T and an input string w, there is no 
algorithm which will determine whether the Turing machine T, given the input 
string w, will reach the halt state. 


Exercises 


(1) Show that a finite set is Turing decidable. 
(2) Find the string representing the rule (s5, A, s2, a, R). 
(3) Find c(M) where M is the machine defined by the rules 


(s1, a, s2, a, R) (s1, b, s2, b, R) (s2, a, s2, a, R) 
(s2, b, s2, b, R) (s1, #, s3, #, R). 
(4) Find c(M) where M is the machine defined by the rules 


(s1, a, s2, #, R) (s1, b, s2, a, R) (s2, a, s2, #, R) 
(s2, b, s2, a, R) (s1, #, s3, #, R). 
(5) Find the rule that corresponds to the string aaabababbaa. 
(6) Find the rule that corresponds to the string aabbbaaabbab. 
(7) Which of the following strings correspond to rules? 
(a) baaabbaabb 
(b) aabbbaaabbbb 
(c) aababaabbaa 
(d) aabaabaabaab 
(e) aabaaabbabb. 
(8) Find the Turing machine that corresponds to the string 


abaaabbbbaabbbabaabababbaababb. 
(9) Find the Turing machine that corresponds to the string 
abaaaababbabbbaababbaababaaababb. 
(10) Find the Turing machine and input that correspond to the string 
abaaaabbbbabbbaabbbbaabbbabbbaababaaababbbaaabbb. 
(11) Find the Turing machine and input that correspond to the string 


abaaaabbbbabbbaabaabaabbbabbbaaabaaabbbaaababaa 
ababababaabb. 


(12) Devise a method of coding that allows the use of A and B as well as 
a and b by allowing strings of length 3 to represent input and output 
symbols. 

(13) Use the coding in the previous problem to find the string corresponding 
to (s1, a, s3, A, R). 

(14) Find the string that represents the machine 


(s1, a, s2, b, R) (s1, b, s2, b, R) (s2, a, s2, #, R) 
(s2, b, s2, b, R) (s1, #, s3, #, R) 


together with input ababaab. 
(15) Find the string that represents the machine 


(s1, a, s2, b, R) (s1, b, s2, a, R) (s2, a, s2, #, R) 
(s2, b, s2, #, R) (s1, #, s3, #, R) 


together with input babbab. 
(16) Let L be a language. Prove that one and only one of the following must 
be true: 
(a) Neither L nor L’ is Turing acceptable. 
(b) Both L and L’ are Turing decidable. 
(c) Either L or L’ is Turing acceptable but not Turing decidable. 


5.4 Undecidability problems for context-free languages 


We begin with Post’s Correspondence Problem, which is not only an interesting 
problem in itself, but is also used to prove that certain statements about 
context-free languages are undecidable. 


Definition 5.6 Given an alphabet Σ, let P be a finite collection of ordered 
pairs of nonempty strings (u_1, v_1), (u_2, v_2), ..., (u_m, v_m) of Σ. Thus P is 
a finite subset of Σ^+ × Σ^+. A match of P is a string w for which there 
exists a sequence of pairs (u_{i_1}, v_{i_1}), (u_{i_2}, v_{i_2}), ..., 
(u_{i_n}, v_{i_n}) such that 

w = u_{i_1} u_{i_2} ··· u_{i_n} = v_{i_1} v_{i_2} ··· v_{i_n}. 

Post’s Correspondence Problem is to determine if a match exists. 

An alternative way to think about Post’s Correspondence Problem is to 
consider two lists A = u_1, u_2, ..., u_n and B = v_1, v_2, ..., v_n where each u_i 
and v_i is a nonempty string of Σ, and there is a match if there exists w such 
that w = u_{i_1} u_{i_2} ··· u_{i_m} = v_{i_1} v_{i_2} ··· v_{i_m}. The important 
factor is that the products must consist of corresponding pairs. 


Example 5.1 Let P = {(a, ab), (bc, cd), (de, ed), (df, f)}, then abcdedf 
and abcdededf are both matches of P. 
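
The definition can be checked mechanically on Example 5.1. The Python sketch below is an illustration only; the function names are invented here, and the search is deliberately bounded by a maximum sequence length, since it can confirm a match but can never report that none exists.

```python
from itertools import product


def is_match(pairs, indices):
    """True if the chosen pairs give the same word on top and bottom."""
    top = "".join(pairs[i][0] for i in indices)
    bottom = "".join(pairs[i][1] for i in indices)
    return len(indices) > 0 and top == bottom


def find_match(pairs, max_len):
    """Try all index sequences of length 1..max_len; return the first match."""
    for n in range(1, max_len + 1):
        for indices in product(range(len(pairs)), repeat=n):
            if is_match(pairs, indices):
                return indices
    return None  # no match of length <= max_len; longer ones are not excluded


P = [("a", "ab"), ("bc", "cd"), ("de", "ed"), ("df", "f")]
print(is_match(P, [0, 1, 2, 3]))     # True: abcdedf
print(is_match(P, [0, 1, 2, 2, 3]))  # True: abcdededf
print(find_match(P, 4))
```

The bound max_len is the essential caveat: removing it turns the search into a procedure that halts exactly when a match exists, which is acceptance rather than decision.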


We wish to show that Post’s Correspondence Problem is not decidable. To 
help us do so we define a modified correspondence system. We shall show that 
if the modified Post’s Correspondence Problem is not decidable, then Post’s 
Correspondence Problem is not decidable. Finally we show that the modified 
Post’s Correspondence Problem is not decidable. 


Definition 5.7 Given an alphabet Σ, let P be a finite collection of 
ordered pairs of nonempty strings (u_1, v_1), (u_2, v_2), ..., (u_m, v_m) of Σ 
together with a special pair (u_0, v_0). In a modified correspondence system, a 
match of P is a string w such that there exists a sequence of pairs 
(u_0, v_0), (u_{i_1}, v_{i_1}), (u_{i_2}, v_{i_2}), ..., (u_{i_n}, v_{i_n}) such that 

w = u_0 u_{i_1} u_{i_2} ··· u_{i_n} = v_0 v_{i_1} v_{i_2} ··· v_{i_n}. 

Thus a match must begin with the designated pair (u_0, v_0). The 
modified Post’s Correspondence Problem is to determine if a match exists in 
a modified correspondence system. 

Note that in the previous example, any match w must begin with (a, ab); 
however, P is not a modified correspondence system, since we are not required 
to begin with (a, ab) when trying to form a match. 


Lemma 5.1 If Post’s Correspondence Problem is decidable, then the modified 
Post’s Correspondence Problem is decidable. 


Proof Let P_1 be a modified correspondence system with the sequence of 
ordered pairs (u_0, v_0), (u_1, v_1), (u_2, v_2), ..., (u_m, v_m), and let the 
alphabet Σ consist of all symbols occurring in any of the u_i or v_i. Every match 
of P_1 must begin with u_0 and v_0. Assume also that * and $ do not occur in Σ. 
For a string w = a_1 a_2 a_3 ··· a_k, define L(w) = *a_1*a_2*a_3 ··· *a_k and 
R(w) = a_1*a_2*a_3* ··· a_k*. Let P_2 contain the pair (L(u_0), L(v_0)*), and for 
all other (u, v) in P_1, let (L(u), R(v)) belong to P_2. In addition, include 
(*$, $) in P_2. It is obvious that only (L(u_0), L(v_0)*) can begin a match in 
P_2, since it is the only pair where we do not have one word in the pair 
beginning with a star while the other does not. It is also obvious that the 
only pair that can end a match in P_2 is (*$, $), since it is the only pair 
where the last symbols match, that is, we do not have one word ending in a star 
while the other does not. 

It is also obvious that if there exists a sequence of pairs 

(u_0, v_0), (u_{i_1}, v_{i_1}), (u_{i_2}, v_{i_2}), ..., (u_{i_n}, v_{i_n}) 

in P_1 such that w = u_0 u_{i_1} ··· u_{i_n} = v_0 v_{i_1} ··· v_{i_n}, then the 
sequence 

(L(u_0), L(v_0)*), (L(u_{i_1}), R(v_{i_1})), (L(u_{i_2}), R(v_{i_2})), ..., (L(u_{i_n}), R(v_{i_n})), (*$, $) 

produces a match 

w' = L(u_0) L(u_{i_1}) L(u_{i_2}) ··· L(u_{i_n}) *$ = L(v_0)* R(v_{i_1}) R(v_{i_2}) ··· R(v_{i_n}) $ 

in P_2. The words 

L(u_0) L(u_{i_1}) L(u_{i_2}) ··· L(u_{i_n}) *$ 

and 

L(v_0)* R(v_{i_1}) R(v_{i_2}) ··· R(v_{i_n}) $ 

in P_2 differ from the words u_0 u_{i_1} ··· u_{i_n} and v_0 v_{i_1} ··· v_{i_n} 
respectively in P_1 in the fact that they have stars between the letters and 
end in $. Conversely, a match of P_2 must begin with (L(u_0), L(v_0)*), and 
deleting the stars and the $ from it, up to the first use of (*$, $), gives a 
match of P_1. 

Hence, since a match in the modified Post’s correspondence system has a 
corresponding match in Post’s correspondence system, and conversely, if Post’s 
Correspondence Problem is decidable, then the modified Post’s Correspondence 
Problem is decidable. 

Example 5.2 Using the previous modified Post’s correspondence system 
P_1 = {(a, ab), (bc, cd), (de, ed), (df, f)} 
with first pair (a, ab) and match abcdedf, we have 
P_2 = {(*a, *a*b*), (*b*c, c*d*), (*d*e, e*d*), (*d*f, f*), (*$, $)} 
with match *a*b*c*d*e*d*f*$. 
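
The passage from a modified correspondence system to an ordinary one, as in the proof of Lemma 5.1, can be sketched as follows. This is a minimal illustration, assuming the characters * and $ do not occur in the pairs; the function names L, R, and to_pcp are chosen here to mirror the notation of the proof.

```python
def L(w, star="*"):
    """Put a star before every letter of w."""
    return "".join(star + ch for ch in w)


def R(w, star="*"):
    """Put a star after every letter of w."""
    return "".join(ch + star for ch in w)


def to_pcp(first, others, star="*", end="$"):
    """Build P_2 from a modified system with designated pair `first`."""
    u0, v0 = first
    pairs = [(L(u0, star), L(v0, star) + star)]           # forced first pair
    pairs += [(L(u, star), R(v, star)) for u, v in others]
    pairs.append((star + end, end))                        # forced last pair
    return pairs


P2 = to_pcp(("a", "ab"), [("bc", "cd"), ("de", "ed"), ("df", "f")])
top = "".join(u for u, _ in P2)       # use every pair once, in order
bottom = "".join(v for _, v in P2)
print(P2)
print(top == bottom, top)             # True *a*b*c*d*e*d*f*$
```

Using the pairs of P_2 once each, in order, corresponds to the match abcdedf of P_1 with the stars inserted and the closing $ appended.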

Theorem 5.9 Post’s Correspondence Problem is undecidable. 


Proof We show that Post’s Correspondence Problem is undecidable by showing 
that the modified Post’s Correspondence Problem is undecidable. We do 
this by showing that if the modified Post’s Correspondence Problem is 
decidable, then L₀ (see previous section) is decidable, which means that it is 
decidable whether a Turing machine accepts a given word. For a given Turing 
machine M and word w, we construct a modified Post’s correspondence system 
that has a match if and only if M accepts w. Intuitively assume 


#s₀w#α₁s₁β₁#α₂s₂β₂# ... #αₖsₖβₖ# 


describes the process used by the Turing machine to read w, where w and 
each of the αᵢ and βᵢ are strings, and the Turing machine begins in state s₀. 
Each of the following steps describes the process for the machine accepting w. 
Hence between consecutive # symbols, each string in the match represents the 
symbols on the tape and the state of the machine for each step as the Turing 
machine progresses in its computation. We wish to create a modified Post’s 
correspondence system which has this description. We shall see that the 
overlapping produced by the rules below, together with the fact that the top 
and bottom rows must match, gives us the process described above. 

Note that reaching an acceptance state is equivalent to reaching a halt state 
since as above, we can create rules that take us from an acceptance state to the 
final state. 

For a given Turing machine and word, we create the modified Post’s 
correspondence system as follows. We shall use two rows to represent the 
first coordinates and second coordinates respectively. The following are the 
rules we are allowed to use. We begin with the pair (#, #s₀w#) so that 
we have 


# 
#s₀w# 

Since this is a modified Post’s correspondence system, we can require that we 
begin with this pair. For each X in Γ we have 


X 
X 


We next use the following pairs to guide us in selecting the next string in our 
match: For each state s which is not a final state, each state s’, and symbols 
X, Y, and Z in Γ, 


sX 
Ys’      if δ(s, X) = (s’, Y, R) 

XsY 
s’XZ     if δ(s, Y) = (s’, Z, L) 

s# 
Xs’#     if δ(s, #) = (s’, X, R) 

Xs# 
s’XY#    if δ(s, #) = (s’, Y, L). 


We shall call these the pairs generated by δ. 
In trying to get our match this set guides us to the next string. For example 
if we have 

...# 
...#11sᵢ011# 


in our match and one of the pairs above is (sᵢ0, 1sⱼ), we will want the next 
string to be #111sⱼ11#. Note however that the two 1s at the beginning and end 
of the string are not affected by the pair above. Hence we need pairs (#, #) and 
(1, 1) to get 


...#11sᵢ011# 
...#11sᵢ011#111sⱼ11#. 


More precisely we would use the pairs (1, 1), (1, 1), (sᵢ0, 1sⱼ), (1, 1), 
(1, 1), and (#, #), in that order. Hence we need the pairs (X, X) for all X 
in Γ. 



Obviously if we never get to an acceptance state (and hence a final state) we 
will never have a match since there will always be an overlap at the bottom. We 
thus need rules to get a match if we reach a halt state h, written sₘ below. We 
use the following pairs, together with (1, 1) and (#, #), to get rid of the 
overlap. 


(0sₘ0, sₘ)    (0sₘ1, sₘ) 
(1sₘ0, sₘ)    (1sₘ1, sₘ) 
(0sₘ, sₘ)     (sₘ0, sₘ) 
(1sₘ, sₘ)     (sₘ1, sₘ) 
(sₘ##, #). 

The last pair gets rid of the overlap sₘ when all of the other symbols have been 
eliminated. Thus if we reached 

...#11sᵢ011# 
...#11sᵢ011#111sₘ11# 


the pairs (1sₘ1, sₘ), (1sₘ, sₘ), (1, 1), (#, #), and (sₘ##, #) would produce 


...#11sᵢ011#111sₘ11#11sₘ1#1sₘ#sₘ## 
...#11sᵢ011#111sₘ11#11sₘ1#1sₘ#sₘ## 


as follows: 

...#11sᵢ011# 
...#11sᵢ011#111sₘ11# 


...#11sᵢ011#1 
...#11sᵢ011#111sₘ11#1 


...#11sᵢ011#11 
...#11sᵢ011#111sₘ11#11 


...#11sᵢ011#111sₘ1 
...#11sᵢ011#111sₘ11#11sₘ 


...#11sᵢ011#111sₘ11 
...#11sᵢ011#111sₘ11#11sₘ1 


...#11sᵢ011#111sₘ11# 
...#11sᵢ011#111sₘ11#11sₘ1# 


...#11sᵢ011#111sₘ11#1 
...#11sᵢ011#111sₘ11#11sₘ1#1 


...#11sᵢ011#111sₘ11#11sₘ1 
...#11sᵢ011#111sₘ11#11sₘ1#1sₘ 


...#11sᵢ011#111sₘ11#11sₘ1# 
...#11sᵢ011#111sₘ11#11sₘ1#1sₘ# 


...#11sᵢ011#111sₘ11#11sₘ1#1sₘ 
...#11sᵢ011#111sₘ11#11sₘ1#1sₘ#sₘ 


...#11sᵢ011#111sₘ11#11sₘ1#1sₘ# 
...#11sᵢ011#111sₘ11#11sₘ1#1sₘ#sₘ# 


...#11sᵢ011#111sₘ11#11sₘ1#1sₘ#sₘ## 
...#11sᵢ011#111sₘ11#11sₘ1#1sₘ#sₘ##. 

Formally we give a proof of the theorem. If we have a valid sequence of 
configurations describing the acceptance of w by M, using induction on the 
number of computation steps we show that there is a partial solution 

#s₀w#α₁s₁β₁#α₂s₂β₂# ... #αₙ₋₁sₙ₋₁βₙ₋₁# 
#s₀w#α₁s₁β₁#α₂s₂β₂# ... #αₙ₋₁sₙ₋₁βₙ₋₁#αₙsₙβₙ#. 


For n = 0, we have 


# 
#s₀w#. 


Assuming the statement is true for k, and sₖ is not the halt state, we have 


#s₀w#α₁s₁β₁#α₂s₂β₂# ... #αₖ₋₁sₖ₋₁βₖ₋₁# 
#s₀w#α₁s₁β₁#α₂s₂β₂# ... #αₖ₋₁sₖ₋₁βₖ₋₁#αₖsₖβₖ#. 


The next pairs are chosen so the string at the top forms #αₖsₖβₖ# using the rules 
above. There is at most one pair in the pairs generated by δ that works. 
We can thus form 

#s₀w#α₁s₁β₁#α₂s₂β₂# ... #αₖ₋₁sₖ₋₁βₖ₋₁#αₖsₖβₖ# 
#s₀w#α₁s₁β₁#α₂s₂β₂# ... #αₖ₋₁sₖ₋₁βₖ₋₁#αₖsₖβₖ#αₖ₊₁sₖ₊₁βₖ₊₁# 


and we have extended to a new partial solution. Since rules generated by δ 
apply to only one letter, the pairs (1, 1) and (0, 0) may be needed to produce 
αₖ and βₖ. If M starting with #s₀w reaches a halt state, there is a rule to get 
from #αₖsₖβₖ# to #αₖ₊₁sₖ₊₁βₖ₊₁#; otherwise, for some k, there is not a rule 
and there can be no match. If, for some k, sₖ is a halt state, then as 
mentioned above, there are rules to make the upper and lower lists agree. 

As already mentioned, if we do not reach the halt state, we cannot have 
a match. If we do reach the halt state, we can produce a match. Hence if 
the modified Post’s Correspondence Problem is decidable, L₀ is decidable. 
Since L₀ is not decidable, the modified Post’s Correspondence Problem is 
undecidable and, by Lemma 5.1, so is Post’s Correspondence Problem. 














Example 5.3 Let the Turing machine be 

M = ({s₀, s₁, s₂, h}, {0, 1}, {0, 1, *, #}, δ, s₀, h) 


and the word be 0110, where 


δ(s₀, 0) = (s₁, *, R) 
δ(s₀, 1) = (s₁, 1, R) 
δ(s₁, 1) = (s₁, 1, R) 
δ(s₁, 0) = (s₂, 0, L) 
δ(s₂, 1) = (s₂, 1, L) 
δ(s₂, 0) = (s₂, 1, R) 
δ(s₂, *) = (h, 0, #) 


with corresponding pairs 


(s₀0, *s₁) 
(s₀1, 1s₁) 
(s₁1, 1s₁) 
(0s₁0, s₂00) 
(1s₁0, s₂10) 
(*s₁0, s₂*0) 
(0s₂1, s₂01) 
(1s₂1, s₂11) 
(*s₂1, s₂*1) 
(s₂0, 1s₂) 
(s₂*, h0). 


In addition we have pairs 


(0, 0) 
(1, 1) 
(#, #) 
(*, *). 

Our first pair is (#, #s₀w#) which produces 


# 
#s₀0110#. 


We now use (s₀0, *s₁) to get 


#s₀0 
#s₀0110#*s₁. 


We then use (1, 1) twice, (0, 0), and (#, #) to get 


#s₀0110# 
#s₀0110#*s₁110#. 


We next use (*, *), (s₁1, 1s₁), (1, 1), (0, 0), and (#, #) to get 


#s₀0110#*s₁110# 
#s₀0110#*s₁110#*1s₁10#. 


Again using (*, *), (1, 1), (s₁1, 1s₁), (0, 0), and (#, #) we get 


#s₀0110#*s₁110#*1s₁10# 
#s₀0110#*s₁110#*1s₁10#*11s₁0#. 


Now using (*, *), (1, 1), (1s₁0, s₂10), and (#, #) we get 


#s₀0110#*s₁110#*1s₁10#*11s₁0# 
#s₀0110#*s₁110#*1s₁10#*11s₁0#*1s₂10#. 


Using (*, *), (1s₂1, s₂11), (0, 0), and (#, #), and then (*s₂1, s₂*1), (1, 1), 
(0, 0), and (#, #), we get 


#s₀0110#*s₁110#*1s₁10#*11s₁0#*1s₂10#*s₂110# 
#s₀0110#*s₁110#*1s₁10#*11s₁0#*1s₂10#*s₂110#s₂*110#. 


Now using (s₂*, h0), (1, 1) twice, (0, 0), and (#, #), we get 


#s₀0110#*s₁110#*1s₁10#*11s₁0#*1s₂10#*s₂110#s₂*110# 
#s₀0110#*s₁110#*1s₁10#*11s₁0#*1s₂10#*s₂110#s₂*110#h0110#. 


Finally, using the pairs (h0, h), (h1, h), and (h##, #) containing h, together 
with (1, 1), (0, 0), and (#, #), both rows become 


#s₀0110#*s₁110#*1s₁10#*11s₁0#*1s₂10#*s₂110#s₂*110#h0110#h110#h10#h0#h##. 
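
The bookkeeping in this example is easy to get wrong by hand, so it can help to replay the chosen pairs mechanically. The Python sketch below is an illustration under stated assumptions: states are written as single characters (p, q, r for the three non-halting states in order, and h for the halt state), the transition on * is the one implied by the pair whose bottom entry is h0, and the elimination pairs (h0, h), (h1, h), and (h##, #) follow the pattern of the halt-state pairs listed before the example.

```python
# Pairs (top, bottom) in the order they are used; p, q, r, h name the states.
SEQUENCE = [
    ("#", "#p0110#"),                                              # start pair
    ("p0", "*q"), ("1", "1"), ("1", "1"), ("0", "0"), ("#", "#"),
    ("*", "*"), ("q1", "1q"), ("1", "1"), ("0", "0"), ("#", "#"),
    ("*", "*"), ("1", "1"), ("q1", "1q"), ("0", "0"), ("#", "#"),
    ("*", "*"), ("1", "1"), ("1q0", "r10"), ("#", "#"),
    ("*", "*"), ("1r1", "r11"), ("0", "0"), ("#", "#"),
    ("*r1", "r*1"), ("1", "1"), ("0", "0"), ("#", "#"),
    ("r*", "h0"), ("1", "1"), ("1", "1"), ("0", "0"), ("#", "#"),
    ("h0", "h"), ("1", "1"), ("1", "1"), ("0", "0"), ("#", "#"),  # erase 0
    ("h1", "h"), ("1", "1"), ("0", "0"), ("#", "#"),              # erase 1
    ("h1", "h"), ("0", "0"), ("#", "#"),                          # erase 1
    ("h0", "h"), ("#", "#"),                                      # erase 0
    ("h##", "#"),                                                  # close
]


def rows(sequence):
    """Concatenate the top entries and the bottom entries separately."""
    return ("".join(t for t, _ in sequence),
            "".join(b for _, b in sequence))


def is_partial_solution(sequence):
    """True if, after every pair, one row is a prefix of the other."""
    top = bottom = ""
    for t, b in sequence:
        top, bottom = top + t, bottom + b
        if not (bottom.startswith(top) or top.startswith(bottom)):
            return False
    return True


top, bottom = rows(SEQUENCE)
print(is_partial_solution(SEQUENCE), top == bottom)  # True True
print(top)
```

The bottom row runs one configuration ahead of the top row until the halt state appears, and the elimination pairs then let the top row catch up.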


We can now use the fact that Post’s Correspondence Problem is undecidable 
to settle several other questions about decidability with regard to 
context-free languages. 

Theorem 5.10 It is undecidable for arbitrary context-free grammars G_1 and 
G_2 whether L(G_1) ∩ L(G_2) = ∅. 


Proof Let P ⊂ Σ^+ × Σ^+ be an arbitrary correspondence system with pairs 
(u_0, v_0), (u_1, v_1), (u_2, v_2), ..., (u_n, v_n). In the following, w^{-1} will 
be w with the letters reversed. For example 1101^{-1} is 1011. Let c be a 
symbol not in Σ and let G_1 be generated by the productions 


S → u_i C v_i^{-1} for i = 1 to n, 
C → u_i C v_i^{-1} for i = 1 to n, 
C → c. 


Thus every word in L(G_1) has the form 

u_{i_0} u_{i_1} u_{i_2} ··· u_{i_m} c v_{i_m}^{-1} ··· v_{i_2}^{-1} v_{i_1}^{-1} v_{i_0}^{-1}. 

Let L(G_2) = {w c w^{-1} | w ∈ Σ*}. Then w c w^{-1} ∈ L(G_1) ∩ L(G_2) if and only if 
w = u_{i_0} u_{i_1} u_{i_2} ··· u_{i_m} = v_{i_0} v_{i_1} v_{i_2} ··· v_{i_m}, which is 
a solution to the Post’s correspondence system. Thus L(G_1) ∩ L(G_2) is 
nonempty if and only if P has a match. Hence it is undecidable for arbitrary 
context-free grammars G_1 and G_2 whether L(G_1) ∩ L(G_2) = ∅. 
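
The correspondence between matches and words of the intersection can be tried out directly. The following Python sketch is illustrative only: it assumes a marker character "|" in place of the letter c (so that the marker cannot collide with letters of the pairs), and the function names are invented here.

```python
def g1_word(pairs, indices, marker="|"):
    """The word of L(G_1) obtained from the index sequence `indices`.

    It has the form u_{i_0}...u_{i_m} marker (v_{i_0}...v_{i_m}) reversed.
    """
    top = "".join(pairs[i][0] for i in indices)
    bottom = "".join(pairs[i][1] for i in indices)
    return top + marker + bottom[::-1]


def in_l_g2(word, marker="|"):
    """True if word = w marker w^{-1} for some w not containing the marker."""
    left, sep, right = word.partition(marker)
    return sep == marker and right == left[::-1]


P = [("a", "ab"), ("bc", "cd"), ("de", "ed"), ("df", "f")]
print(g1_word(P, [0, 1, 2, 3]))            # abcdedf|fdedcba
print(in_l_g2(g1_word(P, [0, 1, 2, 3])))   # True: the indices form a match
print(in_l_g2(g1_word(P, [0, 1])))         # False: abc and abcd differ
```

A word generated from an index sequence lies in both languages exactly when the top and bottom concatenations agree, which is the content of the proof.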














Definition 5.8 A context-free grammar is ambiguous if there are two distinct 
leftmost derivations of the same word. 


Example 5.4 Let Γ = (N, Σ, S, P) be the grammar defined by N = 
{S, A, B}, Σ = {a, b}, and P be the set of productions 


S → aSb   S → aA   A → Bb   A → aA   B → Bb   B → λ   S → λ. 

Obviously a^n b^n can be generated in two different ways. 


Theorem 5.11 It is undecidable whether an arbitrary context-free grammar 
is ambiguous. 


Proof Let P ⊂ Σ^+ × Σ^+ be an arbitrary correspondence system with pairs 
(u_0, v_0), (u_1, v_1), (u_2, v_2), ..., (u_n, v_n). Let α_0, α_1, α_2, ..., α_n be 
symbols not in Σ. We construct two grammars G_1 and G_2 as follows: 


G_1 = (N_1, Σ_α, S_1, P_1) 


where N_1 = {S_1}, Σ_α = Σ ∪ {α_0, α_1, α_2, ..., α_n}, and P_1 = {S_1 → α_i S_1 u_i 
for i = 0, 1, ..., n, and S_1 → λ}. 


G_2 = (N_2, Σ_α, S_2, P_2) 


where N_2 = {S_2}, Σ_α = Σ ∪ {α_0, α_1, α_2, ..., α_n}, and P_2 = {S_2 → α_i S_2 v_i 
for i = 0, 1, ..., n, and S_2 → λ}. 
Obviously G_1 and G_2 are not ambiguous. 

Let G = (N, Σ_α, S, P) where N = {S, S_1, S_2} and P = P_1 ∪ P_2 ∪ {S → S_1, 
S → S_2}. Obviously if there is a match, one derivation begins with S → S_1 
and the other with S → S_2, so G is ambiguous. Conversely if G is ambiguous, 
then u_{i_0} u_{i_1} u_{i_2} ··· u_{i_m} = v_{i_0} v_{i_1} v_{i_2} ··· v_{i_m} for some 
sequence of indices and there is a match. Hence there is a match if and only 
if the context-free grammar is ambiguous. Therefore it is impossible to 
determine whether an arbitrary context-free grammar is ambiguous. 














Exercises 


(1) Show that the class of Turing acceptable languages is closed under union. 

(2) Show that the class of Turing acceptable languages is closed under inter- 
section. 

(3) Show that the class of Turing decidable languages is closed under inter- 
section. 

(4) Show that the class of Turing decidable languages is closed under union. 

(5) Show that the class of Turing decidable languages is closed under con- 
catenation. 

(6) Show that the class of Turing decidable languages is closed under Kleene 
star. 

(7) Show that it is an unsolvable problem to determine, for a given Turing 
machine M, whether there is a string w such that M enters each of the 
machine’s states during its computation of input w. 

(8) Show that it is undecidable for an arbitrary context-free grammar Γ 
whether the language generated by Γ is Σ*. 

(9) Show that for arbitrary context-free grammars Γ and Γ’, it is undecidable 
whether Γ and Γ’ generate the same language. 

(10) Show that there is no algorithm that determines whether the intersection 
of the languages of two context-free grammars contains infinitely many 
elements. 

(11) Show that there is no algorithm that determines whether the complement of 
the language of a context-free grammar contains infinitely many elements. 


A visual approach to formal languages 


6.1 Introduction 


Formal language theory overlaps a close relative among the family of 
mathematical disciplines: the specialty known as Combinatorics on 
Words. We must use a few of the most basic concepts and propositions of this 
field. A nonnull word, q, is said to be primitive if it cannot be expressed in 
the form x^k with x a word and k > 1. Thus, for any alphabet containing the 
symbols a and b, each of the words a, b, ab, bab, and abababa is primitive. The 
words aa and ababab are not primitive and neither is any word in (aba)* other 
than aba itself. One of the foundational facts of word combinatorics, which is 
demonstrated here in Section 6.2, is that each nonnull word, w, consisting of 
symbols from an alphabet Σ, can be expressed in a unique way in the form 
w = q^n where q is a primitive word and n is a positive integer. The uniqueness 
of the representation, w = q^n, allows a useful display of the free semigroup 
Σ^+, consisting of the nonnull words formed from symbols in Σ, in the form of 
a Cartesian product, Q × N, where Q is the set of all primitive words in Σ^+ 
and N is the set of positive integers. Each word w = q^n is identified with the 
ordered pair (q, n). This chapter provides the groundwork for investigations of 
concepts that arise naturally in visualizing languages as subsets of Q × N. In 
the suggested visualizations, the order structure of N is respected. We regard 
N as labeling a vertical axis (y-axis) that extends upward only. We regard Q as 
providing labels for the integer points on a horizontal axis (x-axis) that extends 
both to the left and right except in the case in which the alphabet is a singleton. 
When Σ is a singleton the unique symbol in Σ is the only primitive word and 
Q × N occupies only the vertical axis. 

The set Q of primitive words, over an alphabet having two or more letters, is 
a rather mysterious language. It is known that Q is not a regular language and 
that its complement is not a context-free language. (See Exercises 1 and 2 of 
this section.) At this writing, it is not yet known whether Q itself is context-free. 
Of course Q is clearly recursive and one can confirm that it is context-sensitive. 
With Q itself not fully understood, it is surprising that insightful results can 
be obtained about languages by displaying them in the half plane Q × N. It 
is as though, in constructing displays above Q, we are building castles on 
sand. Nevertheless we proceed by taking Q as a totally structureless countably 
infinite set and we allow ourselves to place Q in one-to-one correspondence 
with the set of integers (on an x-axis) in any way we wish in order to provide 
the most visually coherent display of the language being treated. One might 
say that we take advantage of our ability to sprinkle the grains of sand (i.e., 
primitive words) along the x-axis just as we please. 

In the next section adequate tools and exercises are given to allow the intro- 
duction of visually coherent displays of languages based on the concept of 
primitive words. 


Exercises 


For this set of exercises let Σ = {a, b} serve as an alphabet and let Q be the set 
of all primitive words in Σ^+. 


(1) Let Q’ be the complement of Q in Σ* and note that, for every positive 
integer n, ab^n aab^n a is in Q’. Prove that the language Q’ is not regular. 
Conclude that the language Q also cannot be regular. 

(2) Let Q’ be the complement of Q and note that, for every positive 
integer n, ab^n aab^n aab^n a is in Q’. Prove that Q’ is not a context-free 
language. Although complements of context-free languages need not be 
context-free, there is a subclass of context-free languages, called the 
deterministic context-free languages, for which complements are always 
context-free. Conclude that Q cannot be a deterministic context-free 
language. 

(3) For each positive integer n, find three primitive words, u, v, and w, for 
which uv = w^n. 

(4) Show that both Q and Q’ are recursive languages. 

(5) Let L be a language that has the property that there is a bound B in N 
for which for every u in Σ* there is a word v in Σ* of length at most B 
for which uv is in L. Show that L contains an infinite number of primitive 
words. 

(6) Characterize those regular languages that have the property stated in the 
previous exercise using the intrinsic automaton of L. 


6.2 A minimal taste of word combinatorics 


Throughout this section, uppercase Σ will denote a nonempty finite set of 
symbols that will be used as an alphabet, while u and v will be reserved to 
denote nonnull words in Σ*. Uppercase Q will denote the set of all primitive 
words in Σ*, while p and q will be reserved to denote individual primitive 
words. The following definition for the division of words provides a convenient 
tool for the present exposition: For each pair of words u, v in Σ^+ we define 
u/v to be the ordered pair (n, r) where n is a nonnegative integer and u = v^n r 
with r a suffix of u which does not have v as a prefix. We call r the remainder 
when u is divided by v. 

Observe that for each u, v there is only one such pair that meets the conditions 
in the definition. Note that u is a power of v if and only if u/v = (n, λ), in which 
case u = v^n. 


Proposition 6.1 Words u and v are powers of a common word if and only 
if uv = vu. Moreover, when uv = vu both u and v are powers of the word w 
that occurs as the last nonnull remainder in a sequence of word divisions. The 
length of this w is the greatest common divisor of the lengths of u and v. 





Proof Suppose first that u = w^m and v = w^n. Then uv = w^m w^n = w^{m+n} = 
w^{n+m} = w^n w^m = vu as required. 

Suppose now that uv = vu. If |u| = |v| then u = v. Otherwise, by the 
symmetry of the roles of u and v in the hypothesis, we may assume |v| < |u| and we 
observe that v is a prefix of u. Let w_0 = u and w_1 = v. Define successively each 
word w_2, w_3, ..., as the remainder, w_i, in the division w_{i-2}/w_{i-1} = (n_i, w_i), 
stopping when the word w_k = λ is obtained. That such a w_k = λ must arise is a 
consequence of the observation that each w_{i-1} is a nonnull prefix of w_{i-2} when 
w_{i-1} itself is not null. (See Exercise 3 of this section.) We have in succession: 
w_{k-2}/w_{k-1} = (n_k, λ) and w_{k-2} is a power of w_{k-1}, w_{k-3} is a power of w_{k-1}, 
w_{k-4} is a power of w_{k-1}, ..., w_1 (= v) is a power of w_{k-1}, w_0 (= u) is a power 
of w_{k-1}. Thus u and v are powers of the common word w_{k-1}. A review of the 
sequence of divisions confirms that |w_{k-1}| = gcd(|u|, |v|). 














From the second paragraph of the previous proof we observe that if u and v 
are powers of a common word, then the length of the longest word w for which 
both u and v are powers of w is the greatest common divisor (gcd) of |u| and 
|v|. Consequently, Proposition 6.1 provides two methods of deciding whether 
two words u, v are powers of a common word. One can test the equality of uv 
and vu. Alternatively one can compute g = gcd(|u|, |v|), test the equality of 
the prefixes u’, v’ of length g of u, v, respectively, and if u’ = v’ then compute 
u/u’ and v/v’ to test whether the remainder in each case is λ. 
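
Word division and the two tests just described can be sketched in Python as follows. The sketch is an illustration, not part of the text: words are ordinary Python strings, the empty string plays the role of the null word, and the function names are invented here.

```python
from math import gcd


def divide(u, v):
    """Return (n, r) with u == v * n + r and r not having v as a prefix."""
    if not v:
        raise ValueError("the divisor must be a nonnull word")
    n = 0
    while u.startswith(v):
        u = u[len(v):]
        n += 1
    return n, u


def common_root_by_division(u, v):
    """Last nonnull remainder of successive divisions, or None if uv != vu."""
    if u + v != v + u:
        return None
    prev, cur = (u, v) if len(u) >= len(v) else (v, u)
    while cur:                                  # Euclid's algorithm on words
        prev, cur = cur, divide(prev, cur)[1]
    return prev


def common_root_by_gcd(u, v):
    """Prefix of length gcd(|u|, |v|) if both words are powers of it."""
    g = gcd(len(u), len(v))
    w = u[:g]
    if v[:g] == w and divide(u, w)[1] == "" and divide(v, w)[1] == "":
        return w
    return None


u, v = "abc" * 10, "abc" * 6
print(divide(u, v))                    # (1, 'abcabcabcabc')
print(common_root_by_division(u, v))   # abcabc
print(common_root_by_gcd(u, v))        # abcabc
```

Both functions return the longest common word of which u and v are powers, whose length is gcd(|u|, |v|); the commutation test at the start of the first function is what guarantees that every divisor is a prefix of its dividend, so the loop terminates.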


Proposition 6.2 For each word u in Σ* there is a unique pair (q, n), with q 
in Q and n in N, for which u = q^n. 


Proof For each i with 1 ≤ i ≤ |u|, let u_i be the prefix of u of length i. Compute 
successively the u/u_i until a j occurs at which u/u_j = (m, λ). Such a j will 
certainly occur since u/u_{|u|} = (1, λ). For the pair (u_j, m) we have u_j in Q. 
Consequently u = u_j^m has the required form. Suppose now that u = p^m = q^n, 
where both p and q are in Q. Then 


uu = p^m p^m = p p^{m-1} p^m = p p^m p^{m-1} = p u p^{m-1} = p q^n p^{m-1} = (pq) q^{n-1} p^{m-1} 
and 
uu = q^n q^n = q q^{n-1} q^n = q q^n q^{n-1} = q u q^{n-1} = q p^m q^{n-1} = (qp) p^{m-1} q^{n-1}. 


Thus pq = qp and by Proposition 6.1, p and q must be powers of a common 
word. Since each is primitive, p = q and then also m = n, as required for 
uniqueness. 














For each word u the unique primitive word q for which u is a power of q will
be called the primitive root of u and will be denoted rt(u). Thus Proposition 6.1
may be rephrased: uv = vu if and only if rt(u) = rt(v). Note that for each
word u, u = rt(u)^n for a unique n in N. Thus for each positive integer m,
u^m = rt(u)^{mn}. Thus rt(u^m) = rt(u) for each word u and each positive integer
m. The exponent n in the unique representation u = rt(u)^n will be called the
exponent of u.
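The search over prefixes used in the proof of Proposition 6.2 gives a direct way to compute rt(u) and the exponent of u. The following Python sketch is illustrative only: words are nonnull strings, the names are ours, and divide is again a hypothetical helper for the division u/w = (n, r), assumed to mean u = w^n r with w not a prefix of r.

```python
def divide(u, w):
    """Assumed meaning of u/w: return (n, r) with u == w * n + r,
    where the nonnull word w is not a prefix of the remainder r."""
    n = 0
    while u.startswith(w, n * len(w)):
        n += 1
    return n, u[n * len(w):]


def primitive_root_and_exponent(u):
    """Try the prefixes u_1, u_2, ... of the nonnull word u in order of
    length and stop at the first u_j with u/u_j = (m, null word)."""
    for j in range(1, len(u) + 1):
        m, remainder = divide(u, u[:j])
        if remainder == "":
            return u[:j], m
    raise ValueError("u must be a nonnull word")


def is_primitive(u):
    """A nonnull word is primitive when its exponent is 1."""
    return primitive_root_and_exponent(u)[1] == 1
```

Because the shortest successful prefix is returned, the first component is primitive, in agreement with the argument in the proof.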

Propositions 6.1 and 6.2 are the bedrock of the theory of word combinatorics 
and should become familiar tools for anyone studying formal languages. They 
contain only the information required to begin the discussion of language visu- 
alization. The additional information about word combinatorics that is required 
in the algorithmics of visualization is given in Section 8 of this chapter. For 
further study of the fascinating but subtle mathematics of word combinatorics 
see [36] [35] [28] and [13]. 


Exercises 


(1) Let Σ = {a, b, c} serve as alphabet. Let

u = abcabcabcabcabcabcabcabcabcabc

and

v = abcabcabcabcabcabc.

Given that u and v commute, carry out the steps of the procedure given in
the second paragraph of the proof of Proposition 6.1 for finding the longest
word w for which u and v are powers of w. Give the finite sequence of the
words w_0, w_1, w_2, ..., w_k that arises in this computation.

(2) Determine whether the words u and v as in the previous exercise are powers
of a common string w by each of the two methods stated in the paragraph
following Proposition 6.1. Note that the second method uses Euclid's number
theoretic algorithm for finding greatest common divisors of integers
and produces the longest such w when u and v are powers of a common
string. The first method given produces the shortest such w when u and v
are powers of a common string.

(3) The proof of Proposition 6.1 contains the following assertion: "That such
a w_k = λ must arise is a consequence of the observation that each w_{i-1}
is a nonnull prefix of w_{i-2} when w_{i-1} itself is not null." Prove the
observation that each w_{i-1} is a nonnull prefix of w_{i-2} when w_{i-1} itself
is not null.

(4) Observe that the length function | | : Σ+ → N is a semigroup homomorphism
that maps Σ+ onto the additive semigroup N. Let u and v be any two
nonnull words in Σ+ for which uv = vu. Show that the restriction of the
length function to the subsemigroup {u, v}+ is a semigroup homomorphism
| | : {u, v}+ → N that maps {u, v}+ one-to-one into the additive semigroup
N. What fails in your argument if uv ≠ vu?


6.3 The spectrum of a word with respect to a language 


With each language L and each nonnull word w we define a subset of the
positive integers N called the spectrum, Sp(w, L), of w with respect to L:
Sp(w, L) = {n ∈ N : w^n is in L}. In the display of L in the half plane Q × N
the column at each primitive word, q, displays the spectrum of q. In fact,
the display of the spectra of the words in Q constitutes the representation of
L within Q × N. It is the spectra of the primitive words that are of primary
concern (since the spectrum of w^n can be read directly from the spectrum of w).
However, in Section 6.9 the value of defining the spectra of the nonprimitive
words along with the primitive words is justified.


It is convenient to classify spectra into five qualitatively distinct categories.
The spectrum of a word with respect to a language L may be the empty set,
a finite set, a cofinite set, the entire set N, or an intermittent set. When the
spectrum is the entire set N we say that the spectrum is full. Recall that a set
is cofinite if it has a finite complement. When a spectrum is empty it is also
finite and when a spectrum is full it is also cofinite. By an intermittent spectrum
we mean a spectrum that is neither finite nor cofinite. Note that if Sp(w, L) is
intermittent then, for every positive integer n, there are integers i > n and j >
n for which w^i ∈ L and w^j is not in L.

Each of the five cases is easily illustrated using the one letter alphabet Σ =
{a}: Sp(a, the empty language ∅) is empty. Sp(a, {a, aaa}) is the finite set
{1, 3}. Sp(a, a ∨ aaa+) is the cofinite set {n ∈ N : n = 1 or n ≥ 3}. Sp(a, a+)
is the full set N. Sp(a, (aa)+) is intermittent, being {n ∈ N : n is even}. The
single letter a is the only primitive word for the alphabet Σ = {a}. Spectra of
nonprimitive words for these same languages are illustrated: Sp(aa, ∅) = ∅,
Sp(aaa, {a, aaa}) = {1}, Sp(aa, a ∨ aaa+) = {n ∈ N : n ≥ 2}, Sp(aaa, a+)
is full, and Sp(aaa, (aa)+) = {n ∈ N : n is even}. Note that, in the case of one
letter alphabets, such as Σ = {a}, the distinction between a language and
the spectrum of the letter a is somewhat artificial. Consider now the two letter
alphabet Σ = {a, b}. The spectrum of each word w with respect to the context-free
language L = {w ∈ Σ+ : a and b occur equally often in w} is either empty
or full; Sp(w, L) is full if w is in L and empty otherwise. For the regular
language L = (aa)+ ∨ (bbb)+, Sp(a^n, L) is full if n is even and intermittent if
n is odd. Sp(b^n, L) is full if n is divisible by three and intermittent otherwise.
Finally, Sp(w, L) = ∅ if both a and b occur in w.
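The examples above can be checked mechanically on a finite window of exponents. The sketch below is ours and makes two assumptions: a language is represented by a membership predicate on Python strings (built here from regular expressions via the standard re module), and only Sp(w, L) ∩ {1, ..., bound} is computed. A finite window can confirm the listed values but cannot by itself decide whether a spectrum is finite, cofinite, or intermittent.

```python
import re


def spectrum_window(w, in_L, bound):
    """Return Sp(w, L) restricted to {1, ..., bound}, where in_L is a
    membership predicate for L and w is a nonnull word."""
    return {n for n in range(1, bound + 1) if in_L(w * n)}


def regex_language(pattern):
    """Membership predicate for the language of a Python regular expression."""
    compiled = re.compile(pattern)
    return lambda word: compiled.fullmatch(word) is not None


one_or_three_up = regex_language(r"a|aaa+")      # a ∨ aaa+
even_as = regex_language(r"(aa)+")               # (aa)+
aa_or_bbb = regex_language(r"(aa)+|(bbb)+")      # (aa)+ ∨ (bbb)+


def balanced(word):
    """Nonnull words in which a and b occur equally often."""
    return len(word) > 0 and word.count("a") == word.count("b")
```

For instance, spectrum_window("a", one_or_three_up, 8) omits only 2, and spectrum_window("ab", aa_or_bbb, 6) is empty.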


Exercises 


(1) Let L be a regular language in Σ+ and let N be represented in tally notation
N = |+. Show that, for any word w in Σ+, Sp(w, L) is a regular language
in N = |+.

(2) Let L be a context-free language in Σ+ and let N be represented in tally
notation. Show that, for any word w in Σ+, Sp(w, L) is a regular language
in N = |+.

(3) Let Σ be a finite alphabet that contains at least two symbols. Let w be one
specific fixed word in Σ+. For each of the following sets state whether the
set is countably infinite or uncountably infinite:

(a) {L : L is a language contained in Σ+},

(b) {Sp(w, L) : L ranges through all the languages in Σ+},

(c) {Sp(w, L) : L ranges through all the regular languages in Σ+}, and

(d) {Sp(w, L) : L ranges through all the context-free languages in Σ+}.


6.4 The spectral partition of Σ+ and the support of L


Let L be a language contained in Σ+. This language L provides an equivalence
relation, ~, defined for words u and v in Σ+, by setting u ~ v provided u and v
have identical spectra, i.e., Sp(u, L) = Sp(v, L). We call the partition provided
by ~ the spectral partition, P(L), of Σ+ induced by L. This partition is a
fundamental tool for the present study. In Section 6.7 it is observed that, when L
is regular, P(L) consists of a finite number of constructible regular languages.
Using a refinement of P(L) and Theorem 6.1 gives a precise view of L, within
Q × N, when L is a regular language. The spectral partitions determined by
the languages discussed in Section 6.3 are given next as examples.

Let Σ = {a}. For the language L = Σ+, the spectrum of every word
in Σ+ is full. Consequently P(L) = P(Σ+) consists of a single class,
i.e., P(Σ+) = {Σ+}. For the empty language, ∅, the spectrum of every
word in Σ+ is ∅. Thus P(∅) = {Σ+} also consists of a single class.
For L = {a, aaa}, P(L) = {{a}, {aaa}, Σ+\L}. For L = a ∨ aaa+, P(L) =
{{a}, {aa}, aaa+}. For L = (aa)+, P(L) = {{a^n : n is odd}, {a^n : n is even}}.
Now let Σ = {a, b}. For L = {w in Σ+ : a and b occur equally often in w},
P(L) = {L, Σ+\L}. For L = (aa)+ ∨ (bbb)+, P(L) = {L, a(aa)*, b(bbb)* ∨
bb(bbb)*, Σ*abΣ* ∨ Σ*baΣ*}.

For visualizing a language, L, within Q × N, the spectra of the primitive
words in Σ+ provide the whole picture. If desired, the spectrum of a nonprimitive
word, q^n, can be obtained from the spectrum of its primitive root, q. In
fact, for the task at hand here, there is little motive for interest in the spectra
of individual nonprimitive words. For each equivalence class, C, in P(L)
we are actually only interested in C ∩ Q. The single reason for providing the
definition of the spectra of nonprimitive words is that each resulting spectral
class, C, can often provide satisfactory access to the crucial set of primitive
words C ∩ Q. The first three crucial questions we ask about a set C ∩ Q are:
(a) Is C ∩ Q empty? (b) If not, is C ∩ Q infinite? (c) If C ∩ Q is finite, can its
elements be listed? These questions are answered for the languages discussed
in the previous paragraph in order to provide examples.

For a one letter alphabet, Σ = {a}, the letter itself is the only primitive
word. Consequently for any language L contained in Σ+, C ∩ Q is empty for
each C other than the one containing the letter a. Now let Σ = {a, b}. For
L = {w in Σ+ : a and b occur equally often in w}, each of the two classes
in P(L) = {L, Σ+\L} contains an infinite number of primitive words. For
L = (aa)+ ∨ (bbb)+, we previously obtained P(L) = {L, a(aa)*, b(bbb)* ∨
bb(bbb)*, Σ*abΣ* ∨ Σ*baΣ*}. For these four spectral classes we have:
L ∩ Q is empty; (a(aa)*) ∩ Q = {a}; (b(bbb)* ∨ bb(bbb)*) ∩ Q = {b}; and
(Σ*abΣ* ∨ Σ*baΣ*) ∩ Q is infinite.

For each language L contained in Σ+, the set Su(L) = {q ∈ Q : Sp(q, L) is
not empty} will be called the support of L. For a one letter alphabet, Σ = {a},
the support of each nonempty language L is Σ itself. Now let Σ = {a, b}.
For the language L = {w in Σ+ : a and b occur equally often in w}, Su(L) is
the infinite set L ∩ Q. For L = (aa)+ ∨ (bbb)+, Su(L) is the finite set {a, b}.
The cardinality of the support of a language is of special significance for the
investigations introduced here. When a support is finite, the specific primitive
words in the support are desired.
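Since Sp(q, L) is nonempty exactly when some power of q lies in L, the support can be described as Su(L) = {rt(w) : w ∈ L}. The Python sketch below uses this observation to list the part of the support witnessed by words of bounded length. It is an illustration only: the names are ours, languages are given as membership predicates, and the enumeration of all words up to max_len is exponential, so it is meant for small experiments.

```python
import re
from itertools import product


def primitive_root(u):
    """Return rt(u): the shortest prefix of the nonnull word u of which u is a power."""
    for d in range(1, len(u) + 1):
        if len(u) % d == 0 and u[:d] * (len(u) // d) == u:
            return u[:d]


def support_window(in_L, alphabet, max_len):
    """Return {rt(w) : w in L, 1 <= |w| <= max_len}, a subset of Su(L)."""
    roots = set()
    for k in range(1, max_len + 1):
        for letters in product(alphabet, repeat=k):
            word = "".join(letters)
            if in_L(word):
                roots.add(primitive_root(word))
    return roots


def aa_or_bbb(word):
    """Membership in (aa)+ ∨ (bbb)+."""
    return re.fullmatch(r"(aa)+|(bbb)+", word) is not None


def balanced(word):
    """Nonnull words in which a and b occur equally often."""
    return len(word) > 0 and word.count("a") == word.count("b")
```

With these definitions, support_window(aa_or_bbb, "ab", 6) yields the two letters a and b, while the window for the balanced language keeps growing with max_len, as expected for an infinite support.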


Exercises 


(1) For Σ = {a, b} and L = {(ab^m)^n : m, n ∈ N}:

(a) determine the spectrum of each of the words ab, abbabbabb, ababb;
state whether each spectrum is empty, finite, cofinite, full, or intermittent;

(b) determine the spectral partition P(L); and

(c) determine the support Su(L) and state whether it is a regular language.

(2) Let Σ = {a, b} and L = {a^n b^n : n ∈ N}.

(a) Confirm that the spectrum of each word in Σ+ is either ∅ or {1}.

(b) Determine P(L) and Su(L).

(c) For the language LL, determine the spectra of ab, abab, and ababab.

(d) Describe P(LL) and Su(LL).


6.5 Visualizing languages 


In order to spell out the visualization of a language L within Q × N, we begin
with the usual x-y plane with each point having associated real number
coordinates (x, y). We use only the upper half plane, {(x, y) : y > 0}. With each
integer i and each positive integer n we associate the unit square


R(i, n) = {(x, y) : i − 1 < x ≤ i, n − 1 < y ≤ n}.


In this way the upper half plane is partitioned into nonoverlapping unit squares
{R(i, n) : i an integer, n ∈ N}. To visualize a specific language L in Σ+ we
first identify the set Q with the set Z of integers using any chosen bijection
B : Q → Z. (The bijection B is chosen only after a study of the spectral
partition of the specific language L has been made, as illustrated below.) Once
the bijection B is chosen, each word q^n in Σ+ is associated with (figuratively,
"placed on") the unit square R(B(q), n). Finally, the language L is visualized
by defining, using B, a sketch function S : {R(B(q), n) : q ∈ Q, n ∈ N} →
{Black, White} for which S(R(B(q), n)) = Black if q^n is in L and White
otherwise. For each given language L and each bijection B, the resulting sketch
function is said to provide a sketch of the language L. By the sketch we mean
the image of the sketch function that provides it. Thus we regard the sketch as a
half plane in which each of the unit squares is either black or white. Since there
are many possible choices for B, there may be many possible sketches of L.
For many languages, coherent sketches can be given by basing the choice of the
bijection B on a determination of the spectral partition of the language.
Examples follow for which we use the alphabet Σ = {a, b}. These examples
suggest several new formal language concepts that we believe are worthy of
theoretical development. Each definition given in this section follows immediately
below one or more examples that illustrate or clarify the concept being
defined.


Example 6.1 For L = {w in Σ+ : a and b occur equally often in w}, each of
the two spectral classes in P(L) = {L, Σ+\L} contains an infinite number of
primitive words. The spectrum of each word in L is full and the spectrum of
each word in Σ+\L is empty. Let B be any bijection for which B(L ∩ Q) =
{i ∈ Z : i ≤ 0} and B((Σ+\L) ∩ Q) = {i ∈ Z : i ≥ 1}. The sketch provided
by this choice of B gives a black left quadrant and a white right quadrant. The
support of this language is the infinite set L ∩ Q.


Definition 6.1 A language L is cylindrical if, for each word w in Σ+, Sp(w, L)
is either empty or full.


The language L of Example 6.1 is cylindrical. There are numerous "naturally
occurring" examples of cylindrical languages: The fixed language L =
{w ∈ Σ* : h(w) = w} of each endomorphism h of Σ* is a cylindrical regular
language and so is the stationary language of each such endomorphism
[16] [15]. Retracts and semiretracts [16][10][3][1] of free monoids are cylin- 
drical languages. Investigations of various forms of periodicity in the theory 
of Lindenmayer systems have led to additional examples of cylindrical 
languages [24][26]. 


Example 6.2 For L = {aa, aaa, aaaa, aaaaaa, bbb, bbbb, ababab} only 
three primitive words have nonempty spectra: a, b, and ab. Let B be any 
bijection for which B(a) = 1, B(b) = 2, and B(ab) = 3. The sketch provided 
by such a B gives a half plane that is white except for the three columns 
above the three primitive words a, b, and ab. The column above a reads, from
the bottom up, white, black, black, black, white, black, and white thereafter.
The column above b reads white, white, black, black, and white thereafter. The 
column above ab reads white, white, black, and white thereafter. The support 
of this language is the finite set Su(L) = {a, b, ab}. 
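The columns described in Example 6.2 can be reproduced with a few lines of Python. This is a minimal sketch under our own conventions: the dictionary B records only the three placements B(a) = 1, B(b) = 2, B(ab) = 3 from the example (the remaining, entirely white, columns are omitted), and black and white squares are printed as "#" and ".".

```python
# The finite language of Example 6.2 and the three placements chosen there.
L = {"aa", "aaa", "aaaa", "aaaaaa", "bbb", "bbbb", "ababab"}
B = {"a": 1, "b": 2, "ab": 3}


def column(q, height, language=L):
    """Colors of the squares R(B(q), 1), ..., R(B(q), height), bottom up."""
    return ["black" if q * n in language else "white"
            for n in range(1, height + 1)]


def render(height, language=L, placement=B):
    """Rows of the sketch from the top (n = height) down to n = 1,
    with the columns ordered by their placement."""
    ordered = sorted(placement, key=placement.get)
    return ["".join("#" if q * n in language else "." for q in ordered)
            for n in range(height, 0, -1)]


if __name__ == "__main__":
    print("\n".join(render(7)))
```

Reading a column of the printed picture from the bottom up gives the same sequence of colors as the corresponding call to column.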


Example 6.3 For L = {(a^m b)^n : m, n ∈ N, m ≥ n}, the support of L is
Su(L) = {a^m b : m ∈ N}. Let B be any bijection for which, for each m in N,
B(a^m b) = m. The sketch provided by such a B gives a white left quadrant. The
right quadrant is white above a sequence of black squares ascending upward
at 45 degrees and black below this sequence of squares. The support of this
nonregular language is the infinite regular language a+b.


Definition 6.2 A language L is bounded above if, for each word w in Σ+,
Sp(w, L) is finite. A language L is uniformly bounded above if it is bounded
above and there is a positive integer b for which, for each w in Σ+ and each n
in Sp(w, L), n < b.


Any finite language, such as the one given in Example 6.2, is necessarily 
uniformly bounded above. An infinite language may also be uniformly bounded 
above (Exercise 4, below in this section) or bounded above without a uniform 
bound, as illustrated in Example 6.3. 


Example 6.4 For L = {a, aaa, aaaaa} ∪ b(a ∨ b)*, each word that begins
with a b has a full spectrum and each word that begins with an a and contains
a b has an empty spectrum. Let B be any bijection for which B(a) = 1 and
B(Q ∩ b(a ∨ b)*) = {n ∈ N : n ≥ 2}. The sketch provided by such a B gives
a white left quadrant and a right quadrant that is black except for the column
above a which reads black, white, black, white, black, and white thereafter.
The support of this regular language is the infinite nonregular set
Su(L) = L ∩ Q.


Example 6.5 For L = {(a^m b)^n : m, n ∈ N, m odd, m ≥ n} ∪ {(a^m b)^n : m,
n ∈ N, m even, m ≤ n}, the support of L is Su(L) = a+b. Let B be any
bijection for which, for each m in N, B(a^m b) = m. The sketch provided by
such a B gives a white left quadrant. The right quadrant has a sequence of
black squares ascending upward at 45 degrees. For each odd positive integer
m, (a^m b)^n is black for n ≤ m and white for n > m. Whereas, for each even
positive integer m, (a^m b)^n is white for n < m and black for n ≥ m.


Definition 6.3 A language L is eventual if, for each word w in Σ+, Sp(w, L)
is either finite or cofinite. The language L is uniformly eventual if there is an m
in N for which, for each word w in Σ+, either Sp(w, L) ⊆ {n ∈ N : n < m}
or Sp(w, L) ⊇ {n ∈ N : n ≥ m}.


The language of Example 6.4 is uniformly eventual. The language of Example 6.5
is eventual but not uniformly eventual. Note that each cylindrical
language is uniformly eventual (where any n in N may be taken as the uni- 
form bound). Note also that each language that is (uniformly) bounded above 
is (uniformly) eventual. Every uniformly eventual language is the symmetric 
difference of a cylindrical language and a language that is uniformly bounded 
above (Exercise 2, below in this section). Each noncounting language [31] is 
uniformly eventual as was pointed out in [15] where the concept of an eventual 
language was first introduced. 


Example 6.6 For L = aa ∨ aaa ∨ (aabaab)+ ∨ (ababab)+ ∨ b(a ∨ b)*,
each word that begins with a b has a full spectrum. Each primitive word
that begins with an a has an empty spectrum except for the primitive words
a, aab, and ab. Let B be any bijection for which B(a) = 1, B(aab) = 2,
B(ab) = 3, and B(b(a ∨ b)* ∩ Q) = {n ∈ N : n ≥ 4}. The sketch provided
by such a B gives a white left quadrant and a right quadrant that is black
except for three columns. The column above a reads: white, black, black, and
white thereafter. The columns above aab and ab are both intermittent with
the first having period two and the second having period three. Therefore
Su(L) = {a, aab, ab} ∪ (Q ∩ (bΣ*)).


Definition 6.4 A language L is almost cylindrical (respectively, almost 
bounded above, almost uniformly bounded above, almost eventual, almost 
uniformly eventual) if it is the union of a language with finite support and a 
language that is cylindrical (respectively, bounded above, uniformly bounded 
above, eventual, uniformly eventual). 


The language of Example 6.6 is almost cylindrical and therefore also almost 
uniformly eventual. The uniformly eventual language of Example 6.4 is almost 
cylindrical. The language of Exercise 3 in this section below, is uniformly 
eventual, almost uniformly bounded above, and also almost cylindrical. The 
union of the languages of Exercises 3 and 4 in this section below is uniformly 
eventual and almost uniformly bounded above but not almost cylindrical. The 
union of the languages of Examples 6.5 and 6.6 is almost eventual, but not 
almost uniformly eventual. 

John Harrison provided the first application of the concept of an almost 
cylindrical language in [14]. 

If humor can be tolerated, we may say that the freedom we allow in choos- 
ing the bijections B for determining our language sketches can be supported 
with the slogans: “All Primitives Were Created Equal”, “End Domination by 
Alphabetical Symbols”, and “Power to the Primitives!” 


Exercises


(1) Regular languages that have a given property often have the uniform version 
of the property: 

(a) Show that every regular language that is bounded above is uniformly 
bounded above. 

(b) Show that every regular language that is almost eventual is uniformly 
almost eventual. (Both parts of this exercise may be easier after reading 
Section 6.10.) 

(2) Show that each uniformly eventual language L is the symmetric differ- 
ence of a cylindrical language and a language that is uniformly bounded 
above. 

(3) Let L = a ∨ aaa ∨ (ab)+ ∨ b+. Describe the spectrum of each word in Q.
Find P(L) and Su(L). Choose a bijection B : Q → Z which will provide
a coherent sketch of L. Describe this sketch.

(4) Let L = {a^n b^n : n ∈ N}. Describe the spectrum of each word in Q. Find
P(L) and Su(L). Choose a bijection B : Q → Z which will provide a
coherent sketch of L. Describe this sketch.

(5) Let L = (ΣΣ)+. Describe the spectrum of each word in Σ+. State whether
each spectrum is empty, finite, cofinite, full, or intermittent. Find P(L) and
Su(L). Choose a bijection B : Q → Z which will provide a coherent sketch
of L. Describe this sketch.

(6) Let L = (ab+ab+)+. Find P(L) and Su(L). Choose a bijection B : Q → Z
which will provide a coherent sketch of L. Describe this sketch.

(7) Let L = ((ab+)®)*. Find P(L) and Su(L). Choose a bijection B : Q → Z
which will provide a coherent sketch of L. Describe this sketch.


6.6 The sketch parameters of a language 


Each sketch of a language L in Σ+ is given by a sketch function S that is
determined entirely by L and the choice of a bijection B : Q → Z. Given two
sketches of the same language L, each can be obtained from the other by an
appropriate permutation of columns appearing in the sketches. Mathematically,
distinguishing between different sketches of the same language L is rather
artificial. The distinctions have been made because we prefer the more visually
coherent sketches to those that are less visually coherent. The class of all
sketches of a given language is determined by any one of its members. Observe
that the sketches of a language L are determined by what we call the sketch
parameters of L that we define as follows: There is one sketch parameter for
each spectral class C that contains at least one primitive word. The parameter
associated with such a C is the ordered pair consisting of the spectrum of
any primitive word q in C and the cardinal number of C ∩ Q. This sketch
parameter is therefore the ordered pair (Sp(q, L), K) where q is in C ∩ Q and K
is the cardinal number of C ∩ Q. In the discussion of the examples that follows,
the cardinal number of N, i.e. the denumerable infinite cardinal, is denoted by
the symbol ∞.

For Example 6.1 of Section 6.5, there are only two sketch parameters,
(N, ∞) and (∅, ∞). For Example 6.2 of the same section, there are four
sketch parameters, ({2, 3, 4, 6}, 1), ({3, 4}, 1), ({3}, 1), and (∅, ∞). Example 6.3
has parameters (∅, ∞) and, for each n in N, ({m ∈ N : m ≤ n}, 1). Example 6.4
has parameters ({1, 3, 5}, 1), (N, ∞), and (∅, ∞). Example 6.5 has parameters
as follows: for each m in N with m odd, ({n ∈ N : m ≥ n}, 1); for each
m in N with m even, ({n ∈ N : m ≤ n}, 1); and finally (∅, ∞). Example 6.6
has five parameters ({2, 3}, 1), ({2n : n ∈ N}, 1), ({3n : n ∈ N}, 1), (N, ∞),
and (∅, ∞).
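For a finite language the sketch parameters with nonempty spectrum can be computed directly, since every word w of L contributes its exponent to the spectrum of rt(w). The Python sketch below does this for the language of Example 6.2. It is an illustration under our own naming; the parameter (∅, ∞), which belongs to the infinitely many primitive words with empty spectrum, is not produced by the code.

```python
from collections import Counter, defaultdict


def primitive_root_and_exponent(u):
    """Return (rt(u), n) with u == rt(u) * n for a nonnull word u."""
    for d in range(1, len(u) + 1):
        if len(u) % d == 0 and u[:d] * (len(u) // d) == u:
            return u[:d], len(u) // d


def nonempty_sketch_parameters(finite_language):
    """Map each nonempty spectrum to the number of primitive words having
    exactly that spectrum, for a finite language given as a set of words."""
    spectra = defaultdict(set)
    for word in finite_language:
        root, exponent = primitive_root_and_exponent(word)
        spectra[root].add(exponent)
    return Counter(frozenset(s) for s in spectra.values())


example_6_2 = {"aa", "aaa", "aaaa", "aaaaaa", "bbb", "bbbb", "ababab"}
parameters = nonempty_sketch_parameters(example_6_2)
```

Each key of parameters is a spectrum and each value is the corresponding cardinal, so the result can be compared with the first three parameters listed above for Example 6.2.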

We say that two languages are sketch equivalent if they can be represented
by a common sketch. For example, the context-free language L of Example 6.1
is sketch equivalent to the regular language b(a ∨ b)* since each can be
represented by a sketch that has a black left quadrant and a white right quadrant.
Similarly the context-free language of Exercise 4 of Section 6.5 is sketch equivalent
to the regular language ba* since each can be represented by a sketch that has a 
white left quadrant and a right quadrant that is white except for one horizontal 
black stripe at n = 1. Since the sketch parameters of a language determine the 
class of all possible sketches of a language, two languages are sketch equiva- 
lent if and only if they have the same sketch parameters. Consequently if L and 
L’ are languages for which the sketch parameters can be determined, then one 
may be able to decide whether L and L’ are sketch equivalent by comparing 
the sketch parameters of L and L’. This will certainly be the case if one of the 
languages has only finitely many sketch parameters. In Section 6.10, it is shown 
that every regular language has only finitely many sketch parameters and that 
they can be calculated. 


Open Ended Exercise Investigate the sketch parameters of QQ = {pq :
p, q ∈ Q}.


Open Ended Exercise Which sets of sketch parameters can occur as the set 
of sketch parameters of a language L? This question becomes more interesting 
when L is required to be regular. The regular case might be considered again 
after reading one or more of the remaining sections. 


6.7 Flag languages


Each language L is recognized by its intrinsic automaton M(L). The concept of 
the recognition of a language by an automaton is thoroughly classical, at least 
for the regular languages. A concise presentation for arbitrary languages has 
been included in Chapter 3. In this chapter we apply M(L) only to the study 
of the spectra of regular languages, although applications may be possible in 
additional contexts. The notation of Chapter 3 is used to give a perfectly explicit 
discussion of the sketches of regular languages. 

Assume now that L is a regular language in Σ+ and that its recognizing
automaton M(L) has m states. With each word w in Σ+ we associate a finite
sequence of states of M(L) in the following way: Consider the infinite sequence
of states, {[w^n] : n a nonnegative integer}. Since M(L) has only m states, there
is a least nonnegative integer i for which there is a positive integer j for which
[w^i] = [w^{i+j}]. Let r be the least positive integer for which [w^i] = [w^{i+r}]. We
call the sequence {[w^n] : 0 ≤ n < i + r} the flag F(w) of the word w. The
length of F(w) is i + r. Since M(L) has only m states, the maximum length
of the flag of any word is m. The collection of distinct flags {F(w) : w ∈ Σ+}
associated with a regular language L is necessarily finite. By a flag F of the
language L we mean a sequence of states that constitutes the flag, relative to
L, of some word w in Σ+. With each flag F of L we associate the language
I(F) = {w ∈ Σ+ : F(w) = F}. We call I(F) the language of the flag F. For
each flag F = {s_j : 0 ≤ j < k}, where the s_j denote the states in F, we have


I(F) = ∩ {L(s_j, s_{j+1}) : 0 ≤ j < k − 1}


where each L(s_j, s_{j+1}) is the language that consists of all words x in Σ+ for
which s_j x = s_{j+1}. Since each of the languages L(s_j, s_{j+1}) is regular, each flag
language is regular. The great value of the flag languages, for regular L, is that
they constitute a finite partition of Σ+ into equivalence classes each of which is
a nonempty regular language. The flag partition of Σ+ into the flag languages
determined by L is denoted P'(L).
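The flag of a word can be computed by reading w repeatedly until a state recurs. The Python sketch below assumes a complete deterministic automaton given as a dictionary delta from (state, symbol) pairs to states; the function name, the dictionary encoding, and the four-state automaton for (aa)+ over {a, b} are our own illustrative choices, not notation from Chapter 3.

```python
def flag(delta, start, w):
    """Return (F, i, r): the flag F(w) as the list of states [w^0], ...,
    [w^(i+r-1)], the least i with [w^i] = [w^(i+j)] for some j > 0,
    and the least such period r."""
    first_seen = {}     # state -> index n at which [w^n] first equals it
    sequence = []
    state = start
    while state not in first_seen:
        first_seen[state] = len(sequence)
        sequence.append(state)
        for symbol in w:                 # advance from [w^n] to [w^(n+1)]
            state = delta[(state, symbol)]
    i = first_seen[state]
    r = len(sequence) - i
    return sequence, i, r


# Hypothetical automaton for (aa)+ over {a, b}: state 0 is initial,
# state 2 is final, state 3 is a dead state.
DELTA = {
    (0, "a"): 1, (1, "a"): 2, (2, "a"): 1,
    (0, "b"): 3, (1, "b"): 3, (2, "b"): 3,
    (3, "a"): 3, (3, "b"): 3,
}
```

Since w^n is in L exactly when [w^n] is a final state, the flag of w together with the set of final states determines Sp(w, L). In the automaton above, the words b and ab have the same flag and therefore the same (empty) spectrum.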


Open Ended Exercise In the theory of Abelian Groups the concept of torsion 
plays a fundamental role [8]. Can this suggest a worthwhile concept of torsion 
for language theory? A first attempt might begin with the tentative definition: A 
word w in Σ+ is a torsion word with respect to a language L if the flag of w
in M(L) is finite. If this definition is used then, for each regular L, all words in
Σ+ would be torsion words with respect to L. The torsion words with respect
to the context-free language L = {w in Σ+ : a and b occur equally often in w}
would be the words in {uv : u ∈ Σ*, v ∈ L}.


6.8 Additional tools from word combinatorics


This section contains three additional propositions on word combinatorics that 
are needed for the algorithmics of the next section. Two words x and y are said to 
be conjugates of one another if they possess factorizations of the form x = uv 
and y = vu. From the next proposition it follows that conjugates have the 
same exponent, which includes the information that the conjugates of primitive 
words are primitive. This last fact, that conjugates of primitives are primitive, 
is applied many times in Section 6.9. 
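The statement that conjugates share a common exponent is easy to test on small words. The following Python sketch is ours: conjugates lists all words vu obtained from factorizations x = uv (including the trivial one with u null), and exponent computes the exponent of a nonnull word from its primitive root.

```python
def conjugates(x):
    """Return the set of all words vu with x == uv."""
    return {x[i:] + x[:i] for i in range(len(x))}


def exponent(u):
    """Return the n for which u == rt(u) * n, for a nonnull word u."""
    for d in range(1, len(u) + 1):
        if len(u) % d == 0 and u[:d] * (len(u) // d) == u:
            return len(u) // d


def conjugates_share_exponent(x):
    """Check that every conjugate of x has the same exponent as x."""
    return all(exponent(y) == exponent(x) for y in conjugates(x))
```

For example, the conjugates of the primitive word aab are aab, aba, and baa, all of which are primitive.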


Proposition 6.3 If uv = p^n then vu = q^n with q a conjugate of p.


Proof Since uv = p^n, we may assume that p = u'v' where u = p^i u' and
v = v' p^j with i and j nonnegative integers for which i + j = n − 1. For q =
v'u' we have


q^n = (v'u')^n = v'(u'v')^{n-1} u' = v' p^{n-1} u' = v' p^j p^i u' = vu.

Lemma 6.1 Let v be a word for which vv = xvy with x and y nonnull, then
v = xy = yx.


Proof Since |v| = |x| + |y| and v has x as a prefix and y as a suffix, v = xy.
Then vv = xvy gives xyxy = xxyy and by cancellation yx = xy.

Proposition 6.4 If u^i and v^j, with i, j ≥ 2, have a common prefix of length
|u| + |v| then u and v are powers of a common word.


Proof By the symmetry of u and v in the hypothesis, we may assume |u| ≥ |v|.
Then v is a prefix of u and u = v^n x where u/v = (n, x). The v that occurs as
the prefix of the second u, in the series of u's concatenated to form u^i, occurs
also as a factor of the product of the two v's that occur as the (n + 1)st and
the (n + 2)nd v's concatenated to form v^j. This provides a factorization of the
form vv = xvy. By Lemma 6.1 and Proposition 6.1, x and y are powers of a
common word and therefore so are v = xy and u = (xy)^n x.

Proposition 6.5 Let u and v be words that are not powers of a common word.
For each n in N either u^n v is primitive or u^{n+1} v is primitive. Consequently the
set Q ∩ u*v is infinite.


Proof If both u^n v and u^{n+1} v fail to be primitive then, by Proposition 6.3,
the conjugate u^n v u of u^{n+1} v also fails to be primitive and we have
u^n v = p^i and u^n v u = q^j with p, q primitive and i, j ≥ 2. Then p^{2i} = u^n v u^n v
and q^j = u^n v u have the common initial segment u^n v u which has length

|u^n v u| = (1/2)|u^n v u| + (1/2)|u^n v u| > (1/2)|u^n v| + (1/2)|u^n v u| ≥ |p| + |q|.

By Proposition 6.4, p and q are powers of a common word and, since they
are primitive, p = q. We then have u^n v = p^i and u^n v u = p^j, which gives the
contradiction: u = p^{j-i} and v = p^{i-n(j-i)}. Finally, since at least one member
of each pair from the infinite collection of pairs {u^k v, u^{k+1} v} must be primitive,
the set Q ∩ u*v is infinite.
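Proposition 6.5 can be observed experimentally. In the Python sketch below (our own illustration, with hypothetical function names) the words u = a and v = baab are not powers of a common word; the word u^2 v = aabaab is not primitive, but both of its neighbors u v and u^3 v are.

```python
def is_primitive(w):
    """A nonnull word w is primitive exactly when w occurs inside ww
    only at positions 0 and len(w)."""
    return len(w) > 0 and (w + w).find(w, 1) == len(w)


def each_pair_has_a_primitive(u, v, max_n):
    """Check, for n = 1, ..., max_n, that u^n v or u^(n+1) v is primitive."""
    return all(is_primitive(u * n + v) or is_primitive(u * (n + 1) + v)
               for n in range(1, max_n + 1))


def primitive_words_in_u_star_v(u, v, max_n):
    """List the primitive words among v, uv, u^2 v, ..., u^max_n v."""
    return [u * n + v for n in range(max_n + 1) if is_primitive(u * n + v)]
```

When u and v are powers of a common word the check fails, as it must: every word u^n v is then a proper power as soon as it has length greater than that of the common root.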














Exercises 


(1) Provide an alternative proof of Proposition 6.2 using Proposition 6.4. 

(2) Provide an alternative proof of Proposition 6.2 using Lemma 6.1. 

(3) Let Σ be an alphabet containing the symbols a and b. Let u be any word in
Σ+. Show that at least one of ua and ub must be primitive.

(4) Let u and v be in Σ+. Suppose that, for some n in N, no word in the set
{u^k v | k ≥ n} is primitive. Prove that uv = vu. Can you prove this using
only Lemma 6.2 without using either Proposition 6.4 or Proposition 6.5?


6.9 Counting primitive words in a regular language 


In order to construct the sketch parameters of a language L we will need to
determine the cardinal number of the set C ∩ Q for each spectral class C of L.
The conceptually simple instructions for finding the cardinal of each set L ∩ Q
for any regular language L are given next and followed by a justification that
is a simplified version of a proposition provided by M. Ito, M. Katsura, H. J.
Shyr and S. S. Yu in [25].


The Counting Procedure Let Σ be an alphabet with at least two symbols
and let L be a regular language contained in Σ+. Let n ≥ 2 be the number of
states of the automaton M(L) that recognizes L and let B = 4n. Begin testing
the primitivity of words in L of length ≤ B. As the testing progresses maintain
a list of all primitive words found thus far. If a primitive word p with |p| ≥ n
is encountered, STOP with the information that |L ∩ Q| = ∞. Otherwise
continue the testing and the listing process for words in L of length ≤ B until
either a primitive word p with |p| ≥ n is encountered, or all the words in L
of length ≤ B have been tested. If this procedure has not STOPPED with the
information that |L ∩ Q| = ∞, then the final list of primitive words found is
the complete list of all primitive words in L. Such a list will be finite and may be
empty.
This counting procedure is justified by the following result: 
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Theorem 6.1 Let & be an alphabet with at least two symbols and let L be 
a regular language contained in X*. Let n > 2 be the number of states of the 
automaton M(L) that recognizes L and let B = 4n. Then: (1) LN Q is empty 
if it contains no primitive word of length < B; (2) L N Q is infinite if it contains 
a primitive word of length > n; and (3) if LA Q is infinite then it contains a 
primitive word p with |p| < B. 
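
The Counting Procedure can be sketched in a few lines of Python. This is only 
a brute-force illustration, not an efficient implementation: it assumes that 
M(L) is supplied as a complete deterministic automaton in the form of a 
transition dictionary, it reads the length tests as non-strict inequalities 
(length at most B, length at least n), and the names is_primitive, accepts, 
count_primitive and the two test automata are hypothetical choices made here. 

```python
import math
from itertools import product


def is_primitive(w):
    """Return True iff the nonempty word w is not a power z^k with k >= 2."""
    # Standard test: w is a proper power exactly when w occurs strictly
    # inside the doubled word w + w.
    return len(w) > 0 and w not in (w + w)[1:-1]


def accepts(delta, start, finals, w):
    """Run a complete DFA, given as a transition dict, on the word w."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals


def count_primitive(delta, start, finals, alphabet):
    """Return math.inf if L ∩ Q is infinite, else the list of its words."""
    n = len({state for (state, _) in delta})   # number of states (assumed M(L))
    bound = 4 * n                              # B = 4n
    found = []
    for length in range(1, bound + 1):
        for letters in product(alphabet, repeat=length):
            w = "".join(letters)
            if accepts(delta, start, finals, w) and is_primitive(w):
                if length >= n:
                    return math.inf            # STOP: |L ∩ Q| is infinite
                found.append(w)
    return found


# Hypothetical test automata over {a, b}; state 3 is the dead state.
# The first recognizes a+ ∨ b+, the second recognizes (a ∨ b)a*b*.
A_PLUS_OR_B_PLUS = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 1, (1, "b"): 3,
                    (2, "a"): 3, (2, "b"): 2, (3, "a"): 3, (3, "b"): 3}
AB_THEN_A_STAR_B_STAR = {(0, "a"): 1, (0, "b"): 1, (1, "a"): 1, (1, "b"): 2,
                         (2, "a"): 3, (2, "b"): 2, (3, "a"): 3, (3, "b"): 3}

print(count_primitive(A_PLUS_OR_B_PLUS, 0, {1, 2}, "ab"))
print(count_primitive(AB_THEN_A_STAR_B_STAR, 0, {1, 2}, "ab"))
```

For the first automaton the procedure runs to completion and lists the two 
primitive words a and b; for the second it stops early, at the primitive word 
aaab of length 4 = n. The enumeration visits every word of length at most 4n, 
so the sketch is practical only for very small automata. 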


Proof Note first that (1) will follow immediately once (2) and (3) are proved: 
Suppose L ∩ Q contains no primitive word of length ≤ B. Then, if L ∩ Q 
contained any primitive word at all, that word would have length > B. Then 
L ∩ Q would be infinite by (2) and would contain, by (3), a primitive word p 
for which |p| ≤ B, contradicting the original supposition. Next we prove (2). 

We consider two distinct cases: Suppose first that for every state [u] in M(L), 
there is a word v in Σ^+ for which [uv] is a final state. Since M(L) has only n 
states it follows that there is such a word v with |v| ≤ n − 1. Let a and b be two 
distinct symbols in Σ. For every integer i ≥ n − 1 there is a word v of length 
≤ n − 1 for which a^i b v is in L. Each such a^i b v is primitive and is therefore 
in Q ∩ L. Consequently Q ∩ L is infinite as was to be proved. Surprisingly, 
perhaps, for this case we have a stronger version of (3) since a word 
w = a^(n−1) b v lies in Q ∩ L and n < |w| ≤ 2n − 1 < B. (Exercises 5 and 6 of 
Section 6.1 contain related concepts.) 

Now suppose that M(L) has a state g for which [gv] is not final for any word 
v in Σ^+. Such a state g is often called a dead state. Suppose that w is in L ∩ Q 
and |w| = r ≥ n. As w is read by M(L), a walk is made from the initial state 
to a final state and this walk enters r states after leaving the initial state. This 
walk does not enter g. Since this walk involves a sequence of r + 1 ≥ n + 1 
states there must be a repetition of states among the last n states in the list. This 
gives a factorization w = uxv for which [u] = [ux] where both u and x are 
nonnull and u x^* v ⊆ L. Since uxv = w is primitive, so is its conjugate xvu. 
Since xvu is primitive, x and vu cannot be powers of a common word. By 
Proposition 6.5 (Section 6.8) the set Q ∩ x^* vu is infinite. Since each word in 
u x^* v is a conjugate of a word in x^+ vu, the set Q ∩ u x^+ v is also infinite and 
since also u x^* v ⊆ L, Q ∩ L is infinite as was to be proved. 

Suppose now that |L ∩ Q| = ∞. Let z be a word of minimal length in L ∩ Q. 
To conclude the proof it is only necessary to show that |z| ≤ B: Suppose that 
|z| > B. Since B = 4n and M(L) has only n states, z possesses a factorization 
z = ux'xvy'yw for which: |ux'x| ≤ 2n; |y'yw| ≤ 2n; [u] = [ux'] = [ux'x]; 
and [ux'xv] = [ux'xvy'] = [ux'xvy'y]; and none of the words x', x, y', y, 
uvw is null. We are concerned with the relative lengths of the four words x', x, 
y', and y. It is sufficient to treat only the case in which: |x| ≤ |x'|, |y| ≤ |y'|, 
and |x'| ≤ |y'|. Each of the seven other settings of the inequalities can be treated 
in an exactly analogous manner. (See Exercises 1 and 2 in this section, below.) 

Since uxvw and uxxvw are in L and are shorter than z, neither is in Q. 
Consequently, neither of their conjugates xvwu and xxvwu is in Q. From 
Proposition 6.5 it follows that rt(x) = rt(vwu). Since uxvy'yw and uxxvy'yw 
are in L and are shorter than z, neither is in Q. Consequently, neither of their 
conjugates xvy'ywu and xxvy'ywu is in Q. From Proposition 6.5 it follows that 
rt(x) = rt(vy'ywu). Since ux'vw and ux'x'vw are in L and are shorter than z, 
neither is in Q. Consequently, neither of their conjugates x'vwu and x'x'vwu 
is in Q. From Proposition 6.5, it follows that rt(x') = rt(vwu). We now have 
rt(x') = rt(x) = rt(vy'ywu). Consequently the word (x')(x)(vy'ywu) is not 
primitive, being in fact a power of rt(x). Since z = ux'xvy'yw is a conjugate 
of x'xvy'ywu it cannot be primitive either. This contradiction confirms that the 
shortest word in L ∩ Q has length ≤ B. 














Exercises 


(1) Carry out the proof in the final two paragraphs of Theorem 6.1 above using 
the settings: |x| ≤ |x'|, |y| ≤ |y'|, and |y'| ≤ |x'|. 

(2) Carry out the proof in the final two paragraphs of Theorem 6.1 above using 
the settings: |x'| ≤ |x|, |y| ≤ |y'|, and |y'| ≤ |x|. 

(3) Study the proof of Theorem 6.1 above to see if the given proof will hold if 
you replace B = 4n by B = 4n — 1. Can you reduce B any further without 
some basic additional insight? 


Remark 6.1 The value of B can be reduced a good deal in Theorem 6.1 
and in the resulting Counting Procedure using more powerful tools from word 
combinatorics. This is confirmed for B = 3n − 3 in [25] and later for B = 
(1/2)(5n − 9) by M. Katsura and S. S. Yu. See also [13]. 


6.10 Algorithmic sketching of regular languages 


The spectrum of any word w in Σ^+ relative to a regular language L can be read 
from the flag of w. This is merely a matter of noting which of the states in the 
flag of w are final states of M(L). Thus words having the same flag have the 
same spectrum. There are several absolutely fundamental consequences of this 
fact: (1) The flag partition P' of Σ^+ refines the spectral partition P; (2) since a 
regular language has only finitely many flags it has only finitely many distinct 
spectra; and (3) since each spectral class is the union of a finite number of flag 
languages (each of which is regular) the spectral classes of a regular language 
are regular. Thus both P'(L) and P(L) are finite partitions of Σ^+ into regular 
sets. 


Theorem 6.2 Each regular language has only finitely many sketch parameters 
and these parameters are algorithmically computable. 


Proof Given a regular language L, construct M(L). Let m be the number of 
states of M(L). Only finitely many sequences of states F = {s_j : 0 ≤ j ≤ k} 
with k ≤ m could possibly occur as flags of words in Σ^+. For each such sequence 
F construct the intersection I = ∩{L(s_j, s_(j+1)) : 0 ≤ j ≤ k − 1}, where each 
L(s_j, s_(j+1)) is the language that consists of all words x in Σ^+ for which 
s_j x = s_(j+1). If I is empty then F is not the flag of any word. If I is not empty 
then F is the flag of each word w in I and consequently we have I(F) = I. At 
this point we have determined the partition P'(L) of Σ^+ into the flag languages 
determined by L. Note that each flag F determines the spectrum that is common 
to each word w in I(F) since Sp(w) = {n ∈ N : [w^n] is a final state of M(L)}. 

For each flag F associated with L, determine the spectrum of F and apply 
the Counting Procedure in Section 6.9 to determine the cardinal number of 
Su(I(F)). Since distinct flags may have the same spectrum, flag languages that 
have a common spectrum must be collapsed together. Each spectral class C 
arises as the union of the flag languages it contains. Thus the spectral partition 
P(L) arises as the resulting coarsening of P'(L). Each sketch parameter arises 
from a spectral class C that contains a primitive word q and has the form 
(Sp(q), sum{|Su(I(F))| : I(F) ∈ C}). 














An Example Computation Let L = (a ∨ b)a*b*. One may verify that M(L) 
has four states: [λ], [a] = [b] = [aa] = [ba], [ab] = [bb], [aba] = [bba] = 
[abab] = [baba] = “dead.” There are two final states: [a] and [ab] and 
two nonfinal states [λ] and “dead.” There are six distinct flags: F(a): [λ], 
[a] = [aa]; F(b): [λ], [b], [bb] = [bbb]; F(ab): [λ], [ab], [abab] = “dead;” 
F(bb): [λ], [bb] = [bbbb]; F(ba): [λ], [ba], [baba] = “dead;” and F(aba): 
[λ], [aba] = “dead.” The languages of these six flags are: L(F(a)) = a^+; 
L(F(b)) = b^+; L(F(ab)) = a^+b^+ ∨ ba^+b^+; L(F(bb)) = bb; L(F(ba)) = 
ba^+; L(F(aba)) = (a ∨ b)*(aba ∨ bba)(a ∨ b)*. We count the primitive words 
in each flag language: L(F(a)) contains one primitive word, namely a; L(F(b)) 
contains one, namely b; L(F(ab)) contains an infinite number of primitive 
words; L(F(bb)) contains no primitive words; L(F(ba)) contains infinitely 
many primitive words and so does L(F(aba)). The spectra of these flag 
languages of primitive words are: Sp(a) = N; Sp(b) = N; Sp(ab) = {1}; 
Sp(ba) = {1}; and Sp(aba) = ∅. The two flag languages, containing a and 
b, respectively, have the same spectrum N. Thus the union of these two flag 
languages, which is a^+ ∨ b^+, constitutes a spectral class. The two flag 
languages, containing ab and ba, respectively, have the same spectrum {1}. 
Thus the union of these two flag languages, which is a^+b^+ ∨ ba^+b^+ ∨ ba^+ = 
a^+b^+ ∨ ba^+b^*, constitutes a second spectral class. Finally, the flag language 
containing aba, namely (a ∨ b)*(aba ∨ bba)(a ∨ b)*, constitutes the third 
spectral class of L. The first spectral class contains exactly two primitive words, 
namely, a and b. This gives the parameter: (N, 2). The second spectral class 
contains infinitely many primitive words. This gives the parameter: ({1}, ∞). 
The third spectral class contains infinitely many primitive words which gives 
the parameter: (∅, ∞). 
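
The flags and spectra listed in this example can be checked mechanically. The 
following Python sketch hard-codes the four-state automaton M(L) for 
L = (a ∨ b)a*b* described above. It assumes that the flag of w is the list of 
states [λ], [w], [ww], ... recorded up to the first repetition, and it only 
tabulates the part of Sp(w) below a chosen limit; the state names and the 
function names are hypothetical labels introduced for this illustration. 

```python
# States: "λ" = [λ], "a" = [a] = [b], "ab" = [ab] = [bb], and "dead".
DELTA = {("λ", "a"): "a", ("λ", "b"): "a",
         ("a", "a"): "a", ("a", "b"): "ab",
         ("ab", "a"): "dead", ("ab", "b"): "ab",
         ("dead", "a"): "dead", ("dead", "b"): "dead"}
START, FINALS = "λ", {"a", "ab"}


def step(state, w):
    """State reached from `state` after reading the word w."""
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state


def flag(w):
    """States [w^0], [w^1], [w^2], ... listed until the first repetition."""
    states, state = [], START
    while state not in states:
        states.append(state)
        state = step(state, w)
    return states


def spectrum_up_to(w, limit):
    """The exponents n in 1..limit for which [w^n] is a final state."""
    hits, state = set(), START
    for n in range(1, limit + 1):
        state = step(state, w)
        if state in FINALS:
            hits.add(n)
    return hits


for word in ["a", "b", "ab", "bb", "ba", "aba"]:
    print(word, flag(word), sorted(spectrum_up_to(word, 6)))
```

Because the sequence of states [w^n] is eventually periodic, a finite table of 
this kind, taken far enough, determines the whole spectrum; the limit 6 used 
here is simply a convenient choice for this small automaton. 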

Using the sketch parameters from the example above we provide a sketch of 
L: Let B: Q → Z be any bijection for which: B(a) = 1; B(b) = 2; B establishes 
a one-to-one correspondence between the set of primitive words in the 
second (infinite) spectral class above and the set {z ∈ Z : z ≤ 0}; and B 
establishes a one-to-one correspondence between the set of primitive words in the 
third (infinite) spectral class above and the set {z ∈ Z : z ≥ 3}. In this sketch 
of L, there is a vertical black stripe two units wide above a and b (i.e., x = 1 
and x = 2). The remainder of the right quadrant is white. The left quadrant is 
white except for one horizontal black stripe at the level n = 1. Although this 
language is not bounded above, it is almost uniformly bounded above. It is 
not almost cylindrical, but it is uniformly eventual. From this sketch of L all 
further sketches of L can be obtained by permuting the columns of the given 
sketch. 


Corollary 6.1 Sketch equivalence is decidable for each pair of regular lan- 
guages. Each of the ten language-theoretic properties defined in Section 6.5 is 
decidable for a regular language. 


Procedures These decisions can be made after computing the sketch param- 
eters of the languages in question. Two languages are sketch equivalent if and 
only if they have the same set of sketch parameters. The ten decisions con- 
cerning a regular language are easily made by an examination of the sketch 
parameters of the language. 

Which of the two partitions P(L) and P'(L) induced by a language L in the 
free semigroup Σ^+ is more fundamental may not be clear at this time. In this 
chapter the detailed work has been done at the flag level, P'(L). A previous 
exposition [23] applied the algorithms given by M. Ito, H. J. Shyr, and S. S. Yu 
in their paper [25] to construct the sketch parameters of the regular languages. 
See also [13] for new elegant short proofs providing relevant tools. 




Open Ended Exercise Let Σ be an alphabet and let K be an arbitrary language 
contained in Σ^+. If the sketch parameters of K are given, to what extent can 
they be used to decide whether there is a regular language L that has these 
sketch parameters? Special cases in which K is required to satisfy one or more 
of the ten language-theoretic properties defined in Section 6.5 might be treated. 


Open Ended Exercise Can additional classes of languages be found that 
allow their sketches to be determined? Note that for each context-free language 
L, w^* ∩ L is regular and recall Exercise 2 of Section 6.3. 


Open Ended Project The production of software for displaying sketches of 
languages is encouraged. 

An Aside to Readers Interested in Art The inspiration for the vision-based 
approach to languages came in part from admiration for the late paintings of Piet 
Mondrian and certain paintings by Barnett Newman. Note that one can sketch 
two or more languages on the same half plane and use distinct color pairs for 
distinct languages. 


7 


From biopolymers to formal language theory 


7.1 Introduction 


Living systems on our planet rely on the construction of long molecules by 
linking relatively small units into sequences where each pair of adjoining units 
is connected in a uniform manner. The units of polypeptides (proteins) are a 
set of twenty amino acids. These units are connected by the carboxyl group 
(COOH) of one unit being joined through the amino group (NH2) of the next 
unit, with a water molecule being deleted in the process. The units of RNA are 
a set of four ribonucleotides. These units are connected by the phosphate group 
(PO4 attached at the 5' carbon) of one unit being joined through replacement of 
the hydroxyl group (OH attached at the 3’ carbon) of the next unit, with a water 
molecule being deleted in the process. The units of single stranded DNA are a 
set of four deoxyribonucleotides with the joining process as in the case of RNA. 

Molecules lie in three-dimensional space, whereas words lie on a line. One 
may adopt the convention of listing the amino acids of a protein on a line with 
the free amino group on the left and the free carboxyl group on the right. For 
both single stranded RNA and DNA molecules one may adopt the convention 
of listing their units on a line with the phosphate at the left and the free hydroxyl 
group at the right. These conventions allow us to model (without ambiguity) 
these biopolymers as words over finite alphabets: a twenty letter alphabet of 
symbols that denote the twenty amino acids and two four letter alphabets of 
symbols denoting the four units for RNA and DNA, respectively. 

Within a decade of the announcement in 1953 of the structure of DNA by 
Watson and Crick, mathematicians and scientists were suggesting that bridges 
be found between the study of the fundamental polymers of life and the mathe- 
matical theory of words over abstract alphabets. The biopolymers were modeled 
by words in a free monoid with word concatenation modeling the chemical 
end-to-end joining of biopolymers through deletion of water. To obtain nontrivial 
results in this rarefied context some additional source of structure seems to 
be required. The Shannon information content of biopolymers has long been 
studied. The transcription of DNA into RNA and the translation of RNA into 
protein are easily viewed as actions of finite transducers. This chapter treats 
additions to formal language theory that have their source in the conceptual 
modeling of the actions of enzymatic processes on double stranded DNA. The 
modeling process has motivated new concepts, constructions, and results in the 
theory of formal languages and automata. The focus of this chapter is on these 
new concepts, rather than the associated biomolecular science. Discussion of 
the science that provoked these developments in formal language theory has 
been restricted to this section and Section 7.3, with Section 7.3 optional reading 
for the interested reader. 

Section 7.2 is an informal introduction to what are called splicing opera- 
tions using examples that may appear quite arbitrary at first reading. Those 
who read the optional Section 7.3 will find that the examples of Section 7.2 are 
abstractions of the “cut and paste” actions of commercially available enzymes 
operating on DNA molecules. Section 7.4 provides the definitions and con- 
structions required for a formal theory of splicing. In the remaining sections the 
deepest results relating the theory of splicing systems and the class of regular 
languages are treated. 

Although all of the motivating biomolecular examples given here involve 
double stranded DNA, splicing theory is potentially applicable to polypeptides, 
RNA, DNA (whether single stranded or double stranded) and any other poly- 
mers that may be viewed as strings of related units linked in a uniform manner. 
(The cytoskeletal filaments in eukaryotic cells provide several such examples.) 
Moreover, dsDNA frequently occurs, both in vivo and in vitro, in circular form. 
Linear and circular dsDNA molecules interact (inter-splice) in nature as illus- 
trated for ciliate genomes in [27] and [6]. Interactions between linear and cir- 
cular DNA have been discussed in an abstract splicing context in [18] and [33]. 
A review of in vitro solutions of standard combinatorial computations using the 
cut and paste operations discussed here appears in [20]. The intention of this 
chapter is to stimulate the creation of additional connections between formal 
language theory and the biomolecular sciences. 


7.2 Constructing new words by splicing together pairs of 
existing words 


Given an ordered pair of words over the alphabet {a, c, g, t}, for example 
u = ttttggaaccttt and v = tttggaacctttt, one can consider allowing the 
construction of a new word from these two by “cutting” each, say between the 
two occurrences of the symbol a in the subsequence s = ggaacc that occurs in 
each of u and v, and then building a new string by “pasting” together (concate- 
nating) the left portion of the first string and the right portion of the second string. 
The cutting process applied to the ordered pair u and v gives the fragments: 
ttttgga, accttt and tttgga, acctttt. The pasting of the indicated fragments in 
the indicated order then gives the word x = ttttggaacctttt. In this way the 
ordered pair of words u, v has provided x. Note that the ordered pair v, u fol- 
lowing the same cut and paste operation produces y = tttggaaccttt. We say 
that we have spliced u, v producing x and we have spliced v, u producing y. 
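
The cut and paste operation just described is easy to imitate on strings. The 
Python sketch below is an informal illustration only: it assumes that each 
word is cut at the first occurrence of the site, and the function names cut 
and splice_pair, together with the offset parameter, are hypothetical. 

```python
def cut(word, site, offset):
    """Cut `word` inside the first occurrence of `site`, `offset` symbols in."""
    i = word.index(site) + offset
    return word[:i], word[i:]


def splice_pair(first, second, site="ggaacc", offset=3):
    """Paste the left fragment of `first` to the right fragment of `second`."""
    left, _ = cut(first, site, offset)
    _, right = cut(second, site, offset)
    return left + right


u = "ttttggaaccttt"
v = "tttggaacctttt"
print(cut(u, "ggaacc", 3))    # the two fragments of u
print(splice_pair(u, v))      # the word x
print(splice_pair(v, u))      # the word y
```

With offset 3 the cut falls between the two occurrences of a in ggaacc, so 
the two calls to splice_pair reproduce the words x and y given above. 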

An extensive literature has developed in which the generative power of such 
splicing operations on words has been investigated. Many carefully considered 
control structures have been studied that guide the splicing process. Numerous 
researchers have been able to demonstrate that, by applying various such control 
structures, they can provide universal (Turing equivalent) computational power 
based on splicing operations. We do not pursue this goal here; we stay in the 
realm of regular languages. The original motivation for the introduction of 
the splicing concept was the modeling of the cut and paste actions provided by 
sets of restriction enzymes acting on double stranded DNA (dsDNA) molecules. 
These enzymatic actions are fundamental tools of genetic engineering. Our goal 
is to show that the theory of regular languages provides a formalism through 
which the potential generative power of sets of restriction enzymes acting on 
dsDNA can be represented. Readers who have interdisciplinary inclinations can 
continue with studies of [22] and models of computation based on biochemistry 
[32]. In Section 7.3 a minimal discussion of DNA splicing is given to indicate 
the contact point between splicing as understood in formal language theory and 
in molecular biology. A reader who does not wish for additional motivation 
from the biomolecular sciences may skip Section 7.3 which is not required for 
an understanding of the discussions in later sections. 


7.3 The motivation from molecular biology 


A single stranded DNA (ssDNA) molecule can be viewed as a linear sequence of 
the four covalently bonded deoxyribonucleotides {A = adenine, C = cytosine, 
G = guanine, T = thymine}. For example: 


TTTTGGAACCTTT. 


A dsDNA molecule can be viewed as a linear sequence of hydrogen bonded 
pairs where the hydrogen bonds are between the vertically displayed pairs: 


TTTTGGAACCTTT 
AAAACCTTGGAAA. 


It is adequate here to assume that A and T pair only with each other and C and 
G pair only with each other. Due to this so-called Watson-Crick pairing rule, 
when one strand of a dsDNA molecule is determined the other is also known. 
If (as above) one row is 


TTTTGGAACCTTT 
we know its companion row is 
AAAACCTTGGAAA. 


Consequently, we need to give only one of the two strands. For efficiency 
and convenience we will list only one row of each dsDNA molecule. To be 
certain not to confuse dsDNA and ssDNA, we will use lowercase a, c, g, t to 
denote the paired deoxyribonucleotides: 


A C G T 
T G C A 


respectively. 

Thus TTTTGGAACCTTT is an ssDNA, but ttttggaaccttt is a dsDNA having 
as one of its strands the ssDNA, TTTTGGAACCTTT. 
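
The pairing rule and the lowercase convention can be expressed as a short 
Python sketch. The names PAIR, companion_row and ds_word are hypothetical, and 
the sketch deliberately ignores strand orientation, which is taken up in the 
exercise at the end of this section. 

```python
# Watson-Crick pairing: A with T, and C with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def companion_row(strand):
    """Row displayed beneath `strand`, position by position."""
    return "".join(PAIR[base] for base in strand)


def ds_word(strand):
    """Lowercase word denoting the dsDNA that has `strand` as one strand."""
    return strand.lower()


top = "TTTTGGAACCTTT"
print(companion_row(top))
print(ds_word(top))
```

Applying companion_row twice returns the original row, which reflects the 
remark that either strand determines the other. 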

There are over 200 commercially available restriction enzymes that cut 
dsDNA molecules at specific subsequences (sites). The example given in Sec- 
tion 7.2 is, in fact, a representation of an actual enzymatic process. At an 
occurrence of the site ggaacc in a dsDNA molecule the enzyme Nla IV cuts 
the covalent bonds in each of the single strands that hold the middle a-a of 
the site together. When this cut is made in aqueous solution the left and right 
halves separate due to Brownian motion. The resulting freshly cut ends of the 
fragments can be again connected with restored covalent bonds if an enzyme 
called a ligase is present. Suppose now that we have a test tube which contains, 
dissolved in water (or more precisely, in an appropriate buffer solution), the 
dsDNA molecules u = ttttggaaccttt and v = tttggaacctttt and also Nla IV and 
a ligase. Then Nla IV will cut the two molecules u and v producing the four 
fragments ttttgga’, ‘accttt and tttgga’, ‘acctttt, where the symbols have been 
added to denote the freshly cut ends (technically, the phosphates attached at the 
5’-ends remain after the cutting and are required for future pasting). The ligase 
can now paste together the fragments ttttgga’ and ‘acctttt to yield the dsDNA 
molecule x = ttttggaacctttt. The ligase can also paste together the fragments 
tttgga’ and ‘accttt to yield the dsDNA molecule y = tttggaaccttt. The molecules 
x and y are said to be recombinants of u and v. For completeness we mention 
that the ligase also has the potential for reconstructing the original molecules u 
and v from the four fragments. If we ignore any remaining fragments that have 
freshly cut ends, then we may say that the “language” of all possible molecules 
that can arise in our test tube consists of the molecular varieties u, v, x, and 
y. Some significant details concerning DNA molecules have been suppressed 
above. The interested reader can find these details treated in the following exer- 
cise and the references. 


Exercise 


The two ends of an ssDNA molecule exhibit distinct structures. At one end a 
methyl group protrudes which may have an attached phosphate group. This end 
is referred to as the 5’ end since the carbon atom of the methyl group is counted 
as the 5’ carbon of the sugar substructure to which it belongs. At the other end 
a hydroxyl is attached at the 3’ carbon of the sugar substructure to which it 
belongs. This end is referred to as the 3’ end of the molecule. In modeling one 
must either label the ends or adopt a convention that allows the labels to be 
known otherwise. The ssDNA molecules 5’-ACTTGC-3’ and 3’-ACTTGC-5’ 
are not representations of the same molecule. For dsDNA one must understand 
that the two strands of the molecule always have opposite 5'-3' orientation. 
For convenience and concision we use the convention illustrated here. When, for 
example, acttgc is used to represent a dsDNA molecule it must be understood 
that this molecule has as one strand 5'-ACTTGC-3’ and consequently that the 
dsDNA molecule when fully spelled out is: 


5'-ACTTGC-3' 
3’-TGAACG-5’ 


(a) Write 3’-ACTTGC-5’ with the 5’ end on the left (and the 3’ on the right). 

(b) Write a lowercase representation for the dsDNA molecule that has 3’- 
ACTTGC-5’ as one of its strands. Is there a second lowercase represen- 
tation? Is there a third? 

(c) Which pairs of words, when regarded as models of dsDNA molecules, 
denote the same molecules: acttgc, cgttca, gcaagt, tgaacg, aaattt, tttaaa. 

(d) Verify that each dsDNA molecule, when denoted using the alphabet {a,c,g,t} 
and the conventions established here, has either exactly two distinct repre- 
sentations or only one representation. Give examples of each type. Those 
having only one are said to possess dyadic symmetry. (That dsDNA 
molecules may possess two distinct word representations creates only a 
slight nuisance when constructing splicing models as explained in [17], [21] 
and [22].) 


7.4 Splicing rules, schemes, systems, and languages 


The previous sections have been written in an informal manner, possibly allow- 
ing ambiguity between molecules and the words used to represent them. The 
remainder of this chapter deals specifically with words in a free monoid. (How- 
ever, all results in the chapter have meaningful interpretations for enzymes 
acting on dsDNA.) 

Let Σ be a finite set to be used as an alphabet. Let Σ* be the set of all strings 
over Σ. By a language we mean a subset of Σ*. A splicing rule is an element 
r = (u, u', v', v) of the product set 


Σ* × Σ* × Σ* × Σ*. 


The action of the rule r on a language L defines the language r(L) = {xuvy 
in Σ* : L contains strings xuu'q and pv'vy for some x, q, p, and y in Σ*}. 
For each set, R, of splicing rules we extend the definition of r(L) by defining 
R(L) = ∪{r(L) : r ∈ R}. A rule r respects the language L if r(L) is contained 
in L and a set R of rules respects L if R(L) is contained in L. By the radius of 
a splicing rule (u, u', v', v) we mean the maximum of the lengths of the strings 
u, u', v', v. 


Definition 7.1 A splicing scheme is a pair σ = (Σ, R), where Σ is a finite 
alphabet and R is a finite set of splicing rules. For each language L and 
each nonnegative integer n, we define σ^n(L) inductively: σ^0(L) = L and, 
for each nonnegative integer k, σ^(k+1)(L) = σ^k(L) ∪ R(σ^k(L)). We then define 
σ*(L) = ∪{σ^n(L) : n ≥ 0}. A splicing system is a pair (σ, I), where σ is a 
splicing scheme and I is a finite initial language contained in Σ*. The language 
generated by (σ, I) is L(σ, I) = σ*(I). A language L is a splicing language if 
L = L(σ, I) for some splicing system (σ, I). 


Example 7.1 Let Σ = {a, c, g, t}. Let r = (u, u', v', v) where the four words 
u, u', v', v in Σ* appearing in the rule r are u = v' = gga and u' = v = acc. 
Let R = {r}. This gives the splicing scheme 


σ = (Σ, R) = ({a, c, g, t}, {(gga, acc, gga, acc)}). 
Let 


I = {ttttggaaccttt, tttggaacctttt}. 
Observe that r applied to the ordered pair 
(ttttggaaccttt, tttggaacctttt) 

of words in I gives the word ttttggaacctttt, and r applied to the ordered pair 
(tttggaacctttt, ttttggaaccttt) 


of words in I gives tttggaaccttt. The less interesting actions of r on I must 
be recognized: When r acts on ordered pairs in the “diagonal” of I × I, for 
example on 


(ttttggaaccttt, ttttggaaccttt) 


the result is merely ttttggaaccttt which appeared as each coordinate of the 
pair. Here we have 


σ^0(I) = I = {ttttggaaccttt, tttggaacctttt} 
and 


σ^1(I) = σ^0(I) ∪ R(I) 
= I ∪ {ttttggaacctttt, tttggaaccttt, ttttggaaccttt, tttggaacctttt} 


equals 
{ttttggaacctttt, tttggaaccttt, ttttggaaccttt, tttggaacctttt}. 


Notice that R respects σ^1(I) and consequently σ^2(I) = σ^1(I). Then also 
σ^3(I) = σ^2(I) = σ^1(I) and in fact σ*(I) = σ^1(I). Thus L(σ, I) is the finite 
language 


σ*(I) = {ttttggaacctttt, tttggaaccttt, ttttggaaccttt, tttggaacctttt}. 


This example connects the formal definitions of splicing systems and languages 
with the less formal introductory remarks of Sections 7.1 and 7.2. 
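
The definitions of r(L), R(L) and the iteration of a splicing scheme translate 
directly into a small Python sketch. It is offered only as an illustration 
for finite languages: the function names apply_rule and iterate and the 
safety limit max_rounds are hypothetical, and the iteration simply stops when 
a round adds no new word or when the limit is reached. 

```python
def apply_rule(rule, language):
    """r(L): all words x u v y with x u u' q and p v' v y in L."""
    u, u1, v1, v = rule
    lefts, rights = set(), set()
    for w in language:
        for i in range(len(w) + 1):
            if w.startswith(u + u1, i):     # w = x u u' q with |x| = i
                lefts.add(w[:i + len(u)])   # keep x u
            if w.startswith(v1 + v, i):     # w = p v' v y with |p| = i
                rights.add(w[i + len(v1):]) # keep v y
    return {left + right for left in lefts for right in rights}


def iterate(rules, initial, max_rounds=20):
    """Apply one round after another; report whether a fixed point was reached."""
    current = set(initial)
    for _ in range(max_rounds):
        new = set(current)
        for rule in rules:
            new |= apply_rule(rule, current)
        if new == current:
            return current, True
        current = new
    return current, False


RULES = [("gga", "acc", "gga", "acc")]
INITIAL = {"ttttggaaccttt", "tttggaacctttt"}
language, stable = iterate(RULES, INITIAL)
print(sorted(language), stable)
```

For the rule and initial language of Example 7.1 the iteration stabilizes 
after one productive round with the four words listed there. For systems that 
generate infinite languages the iteration never stabilizes, and only the 
finite stages σ^n(I) can be inspected in this way. 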


Example 7.2 Let Σ = {a, c, g, t}. Let r = (c, cccgg, c, cccgg), R = {r}, 
and let I contain only one word of length 30, 


I = {aaaaaaccccggaaaaaaccccggaaaaaa}. 
The rule can be applied to the ordered pair 
(a^6 ccccgg a^6 ccccgg a^6, a^6 ccccgg a^6 ccccgg a^6) 


with cuts made using the right occurrence of ccccgg in the first coordinate and 
the left occurrence of ccccgg in the second coordinate. This gives the word of 
length 42: 
a^6 ccccgg a^6 ccccgg a^6 ccccgg a^6. 


The rule can be also applied to the ordered pair using the left occurrence of 
ccccgg in the first coordinate and the right occurrence of ccccgg in the second 
coordinate. This gives the word of length 18, a^6 ccccgg a^6. Thus 


σ^1(I) = {a^6 ccccgg a^6, a^6 ccccgg a^6 ccccgg a^6, 
a^6 ccccgg a^6 ccccgg a^6 ccccgg a^6}. 


Continuing with similar considerations one finds that L(σ, I) = σ*(I) is the 
infinite regular language 


a^6 ccccgg a^6 (ccccgg a^6)*. 
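
Since the language of Example 7.2 is infinite, a program can only explore a 
finite part of it. The Python sketch below closes the initial word under the 
rule while discarding every word longer than a chosen bound, and then compares 
the result with the regular expression stated above. The bound of 66 symbols, 
the function names and the decision to discard long words are assumptions made 
for this illustration; in general, discarding long intermediate words could 
lose short words, although that does not happen here. 

```python
import re


def splice(rule, language, max_len):
    """One round of the rule on `language`, keeping words of length <= max_len."""
    u, u1, v1, v = rule
    lefts, rights = set(), set()
    for w in language:
        for i in range(len(w) + 1):
            if w.startswith(u + u1, i):
                lefts.add(w[:i + len(u)])
            if w.startswith(v1 + v, i):
                rights.add(w[i + len(v1):])
    return {l + r for l in lefts for r in rights if len(l + r) <= max_len}


def bounded_closure(rule, initial, max_len):
    """Repeat rounds until no new word of length <= max_len appears."""
    current = set(initial)
    while True:
        new = current | splice(rule, current, max_len)
        if new == current:
            return current
        current = new


RULE = ("c", "cccgg", "c", "cccgg")
START_WORD = "a" * 6 + "ccccgg" + "a" * 6 + "ccccgg" + "a" * 6
PATTERN = re.compile(r"a{6}ccccgga{6}(ccccgga{6})*")

words = bounded_closure(RULE, {START_WORD}, max_len=66)
print(sorted(len(w) for w in words))
print(all(PATTERN.fullmatch(w) for w in words))
```

Every word found has length 18 + 12k and matches the pattern, which agrees 
with the description of L(σ, I) in the example. A bounded search of this kind 
supports the claim but does not prove it. 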

Example 7.3 We may interpret the 30 symbol word given in Example 7.2 
as a model of a dsDNA molecule as indicated in Section 7.3. The rule r of 
Example 7.2 represents the cut and paste activity of the restriction enzyme 
BsaJ I accompanied by a ligase. With these understandings the language 


L(σ, I) = a^6 ccccgg a^6 (ccccgg a^6)* 


obtained in Example 7.2 is a model of the set of all dsDNA molecules (having no 
freshly cut ends) that can potentially arise in a test tube containing BsaJ I, a 
ligase, and (sufficiently many) dsDNA molecules of model 
a^6 ccccgg a^6 ccccgg a^6. The ability to make assertions as in the preceding 
sentence motivated the introduction of the splicing concept into formal 
language theory. 


Example 7.4 Let Σ = {a, b, c}. Let L be the regular language caba*b. Can 
we find a splicing system that generates L? Yes, this can be done very easily 
by taking advantage of the fact that the symbol c occurs as the leftmost 
symbol of every word in L and occurs nowhere else in any word of L. Let 
r = (caba, a, cab, a) and let I = {cabb, cabab, cabaab}. Note that r allows 
the generation of cabaaab as follows: From the ordered pair (cabaab, cabaab), 
and the two distinct conceptual analyses caba/ab and cab/aab, the rule r gives 
the new word cabaaab. Then from the ordered pair (cabaab, cabaaab), and 
the analyses caba/ab and cab/aaab, the rule r gives the word cabaaaab. 
(Note that r provides a form of pumping.) Continuing in this way all words 
caba^n b with n ≥ 3 can be obtained. Since caba^n b with 0 ≤ n ≤ 2 were given 
in I, we have, for R = {r} and σ = (Σ, R), L(σ, I) = caba*b as desired. In 
fact it has been shown [12] that for any regular language L' over any alphabet 
Σ, by choosing a symbol not in Σ, say c, the language 


L = cL' = {cw : w ∈ L'} 
is generated by a splicing system that can be specified very much as we have 
done for the language caba*b in this example. Thus informally speaking, each 
regular language is almost a splicing language. 


Example 7.5 Let Σ = {a, b}. The regular language L = (aa)* cannot be 
generated by a splicing system. As the reader may verify, any finite set of rules 
that allows every word in L to be generated will also generate strings of odd 
length as well as the strings of even length. 


Example 7.6 The regular language L' = a*ba*ba* cannot be generated by a 
finite set of rules either: For any nonnegative integer n, 


R_n = {(λ, ba^n b, λ, aba^n b), (ba^n ba, λ, ba^n b, λ)} 
and 
I_n = {aba^n b, ba^n b, ba^n ba} 


generate a*ba^n ba*. 

Consequently, for any finite subset F of nonnegative integers, R = 
∪{R_n : n ∈ F} and I = ∪{I_n : n ∈ F} generate the language L'' = 
∪{a*ba^n ba* : n ∈ F}. However, as the reader may verify, any finite set 
of rules and finite initial language that generate all words in a*ba*ba* will 
also generate words in which the symbol b occurs more than twice. Thus there 
are regular languages that are not splicing languages. 


Exercises 


(1) Let L be any finite language over any alphabet Σ. Specify a splicing system 
that generates L. (Hint: The set R of rules can be empty.) 

(2) Let Σ = {a, b}. Find three splicing systems that generate, respectively, 
(i) L = b(aa)*; (ii) L = Σ*; and (iii) L = ba*ba*. 

(3) Let Σ = {a, b, c}. Find three splicing systems that generate, respectively, 
(i) L = ab*abc; (ii) L = ab*cab; (iii) L = a*ba*ca*ba*. 


7.5 Every splicing language is a regular language 


Splicing languages were introduced in published form for the first time in 1987 
[17]. Fortunately K. Culik II and T. Harju quickly announced in 1989 [4], [5] that 
all splicing languages are regular. A second exposition of the regularity result 
was given by D. Pixton in 1996. This exposition provided, for each splicing 
system, an explicit construction of a finite automaton that was concisely proved,
using an insightfully constructed inductive set, to recognize the language
generated by the splicing system. In 1989, R. Gatterdam observed [11] that not
all regular languages are splicing languages. So, which regular languages are 
splicing languages? We would like to have a theorem that characterizes the class 
of splicing languages in terms of previously known classes of languages. As 
yet we have no such characterization. In [17] it was observed that a language L 
is a splicing language if there is a positive integer n such that uxq is always in 
L whenever x has length n and both uxv and pxq are in L. These languages, 
which were analyzed rather thoroughly in [19], constitute a highly restricted 
subclass of the splicing languages, enlargements of which have been studied 
extensively in [12]. The interested reader is urged to study very carefully Pix- 
ton’s proof of regularity which is given in [33], [21], [32], [22] and broadly 
generalized in [34]. 

With no crisp characterization of the class of splicing languages found, 
concern turned to the search for an algorithm for deciding whether a given 
regular language can be generated by a splicing system. There is, of course, an 
easily described procedure that is guaranteed to discover that a regular language 
L ⊆ Σ* is a splicing language if L is a splicing language: For each positive
integer n, for each set R of rules of radius ≤ n, and for each subset I of L
consisting of strings of length ≤ n, decide whether L(σ, I) = L, where σ =
(Σ, R). Since both L and each such L(σ, I) are regular, all these steps can be
carried out. The procedure terminates when a system (σ, I) with L(σ, I) = L is
found, but fails to terminate when L is not a splicing language. From this
triviality, however, it follows that an algorithm will become available
immediately if, for each regular language L, a bound, N(L), can be calculated
for which it can be asserted that L cannot be a splicing language unless there
is a splicing system having rules of radius ≤ N(L) and initial strings of
length ≤ N(L). (Recall from Section 7.2
that the radius of a rule r = (u, u’, v’, v) is the length of the longest of the four 
words u, u’, v’, and v.) The determination of the bound N(L) is a conceptual 
victory for the concept of the syntactic monoid of a language because it allows 
the concise statement of an adequate bound N(L), given in Section 7.7. It also 
provides a valuable tool stated in the heading of Section 7.6. 


Exercises 


(1) Let Σ = {a, b}. Find a regular expression that represents L(σ, I) where
σ = (Σ, R), R = {r}, r = (b, b, a, b), and I = {abba}.

(2) Let Σ = {a, b, c}. Find a regular expression that represents L(σ, I) where
σ = (Σ, R), R = {r1, r2}, r1 = (ab, c, cb, a), r2 = (cb, a, ab, c), and I =
{abc, cba}.


(3) Same as Exercise 2 with one new rule added: r3 = (cbc, λ, cb, c) so that
R = {r1, r2, r3}.


7.6 The syntactic monoid of a regular language L allows 
an effective determination of the set of all splicing 
rules that respect L 


First we show how to decide whether a given splicing rule r respects a given
regular language L ⊆ Σ*: Let M be the minimal automaton recognizing L. Let
S be the set of all states of M and let F be the set of final states. For each state s
in S and each word w in Σ* we denote by sw the state of M arrived at after w is
read from state s. Note that the rule r = (u, u′, v′, v) respects L if and only if,
for each ordered pair of states p, q of M for which {x ∈ Σ* : puu′x ∈ F} and
{y ∈ Σ* : qv′vy ∈ F} are not empty, {z ∈ Σ* : qv′vz ∈ F} ⊆ {z ∈ Σ* : puvz ∈ F}.
The emptiness conditions and the inclusion are decidable since each of the four
sets is regular.
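
The test just described can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the automaton is given as a complete deterministic transition table in which every state is reachable from the initial state (as in a minimal automaton), and the names `run`, `can_reach_final`, `included`, and `respects`, as well as the example automaton for ba*, are hypothetical choices made for illustration.

```python
from itertools import product


def run(delta, state, word):
    """State reached from `state` after reading `word`."""
    for symbol in word:
        state = delta[state][symbol]
    return state


def can_reach_final(delta, finals, state):
    """True if the set of words leading from `state` into F is not empty."""
    seen, stack = {state}, [state]
    while stack:
        s = stack.pop()
        if s in finals:
            return True
        for t in delta[s].values():
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return False


def included(delta, finals, s, t):
    """True if every word accepted from s is also accepted from t."""
    seen, stack = {(s, t)}, [(s, t)]
    while stack:
        a, b = stack.pop()
        if a in finals and b not in finals:
            return False
        for symbol in delta[a]:
            pair = (delta[a][symbol], delta[b][symbol])
            if pair not in seen:
                seen.add(pair)
                stack.append(pair)
    return True


def respects(delta, finals, rule):
    """Decide whether rule = (u, u1, v1, v) respects the language of the DFA."""
    u, u1, v1, v = rule
    for p, q in product(delta, repeat=2):
        if not can_reach_final(delta, finals, run(delta, p, u + u1)):
            continue
        if not can_reach_final(delta, finals, run(delta, q, v1 + v)):
            continue
        if not included(delta, finals, run(delta, q, v1 + v), run(delta, p, u + v)):
            return False
    return True


# Hypothetical example: complete DFA for ba* over {a, b}; state 2 is a sink.
DELTA = {0: {"a": 2, "b": 1}, 1: {"a": 1, "b": 2}, 2: {"a": 2, "b": 2}}
FINALS = {1}

print(respects(DELTA, FINALS, ("a", "", "", "a")))  # True
print(respects(DELTA, FINALS, ("b", "", "", "b")))  # False
```

The second rule fails because b and b are both in ba*, yet splicing them with (b, λ, λ, b) produces bb, which is not.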

Next we show how to specify all of the rules that respect the regular lan- 
guage L in a manner that requires that the procedure above be used on only 
a finite number of rules. Recall that the syntactic congruence relation, C, in
Σ* is defined by setting u′Cu if and only if, for every pair of strings x and
y ∈ Σ*, either both xu′y and xuy are in L or neither is in L. Since L is
regular, the number of C-congruence classes is a positive integer which we
denote n(L). Then there are precisely [n(L)]⁴ ordered quadruples of congruence
classes. Let (W, X, Y, Z) be an ordered quadruple of congruence classes.
Let r = (w, x, y, z) and r′ = (w′, x′, y′, z′) be two rules in W × X × Y × Z.
We verify that r respects L if and only if r′ respects L. By the symmetry of
the roles of r and r′ in the hypothesis, we need only assume that r respects
L and verify that then r′ must respect L. Suppose that r respects L and that
the pair uw′x′v, sy′z′t is in L. We need only show that uw′z′t is in L: From
w′Cw we have uwx′v is in L and from x′Cx we then have uwxv in L. From
y′Cy and z′Cz it follows that syzt is in L. Since r respects L and the pair
uwxv, syzt is in L, we have uwzt in L. From w′Cw and z′Cz it follows that
uw′z′t is in L, as required. Thus, to specify all the rules that respect L, we
construct the [n(L)]⁴ quadruples of syntactic classes determined in Σ* by L
and, from each such quadruple (W, X, Y, Z), we choose one word from each
class to obtain one rule (w, x, y, z) and then decide whether it respects L. If
it does then every rule in W × X × Y × Z respects L. If it does not respect L
then no rule in W × X × Y × Z respects L. This discussion has justified the
following:

Proposition 7.1 Let L be a regular language. The set of rules that respect L
has the form


∪{W_i × X_i × Y_i × Z_i : 1 ≤ i ≤ m}
where m is a nonnegative integer and each of the sets
W_i, X_i, Y_i, Z_i (1 ≤ i ≤ m)
is an element of the syntactic monoid of L.


Since each syntactic class of a regular language L is itself a regular language, 
one can list all the strings of length at most k in the class. Consequently when 
the representation in the proposition has been constructed, the set of all rules 
of radius at most k that preserve L can be listed with no additional testing: For 
each of the sets 


W_i × X_i × Y_i × Z_i (1 ≤ i ≤ m)
in the representation, list all of the rules (w, x, y, z) in
W_i × X_i × Y_i × Z_i


of radius at most k. In order to create such a list without using the syntactic
monoid it would be necessary to list every rule of radius at most k in all of
[Σ*]⁴ and test every such rule individually to decide if it preserves L.
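
The number n(L) and one representative word per syntactic class can be computed as sketched below. The sketch relies on the standard fact that the syntactic monoid of a regular language is isomorphic to the transition monoid of its minimal complete automaton; the input is therefore assumed to be such an automaton. The function name and the example automaton for ba* are hypothetical.

```python
from collections import deque


def transition_monoid(delta, alphabet):
    """Map each state transformation induced by a word to a shortest such word.

    `delta` is assumed to be the transition table of a minimal complete DFA,
    so that the transformations correspond to the syntactic classes.
    """
    states = sorted(delta)
    identity = tuple(states)          # transformation of the empty word
    representative = {identity: ""}
    queue = deque([identity])
    while queue:
        t = queue.popleft()
        for symbol in alphabet:
            # Transformation of (word for t) followed by `symbol`.
            nxt = tuple(delta[s][symbol] for s in t)
            if nxt not in representative:
                representative[nxt] = representative[t] + symbol
                queue.append(nxt)
    return representative


# Hypothetical example: minimal complete DFA for ba* over {a, b}.
DELTA = {0: {"a": 2, "b": 1}, 1: {"a": 1, "b": 2}, 2: {"a": 2, "b": 2}}

classes = transition_monoid(DELTA, "ab")
n = len(classes)
print(n, sorted(classes.values()))   # 4 ['', 'a', 'ab', 'b']
print(n ** 4)                        # 256 quadruples of classes to test
```

For this language only 256 representative rules need to be tested, one per quadruple of classes, however large the radius bound k is taken to be.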


Exercises 


(1) Let Σ = {a} and L = (aa)*. Construct the syntactic monoid of L.
(2) Let Σ = {a, b} and L = a*ba*ba*. Construct the syntactic monoid of L.
(3) Construct the syntactic monoid of the language in Exercise 3 of Section 7.5. 


7.7 It is algorithmically decidable whether a given
regular language is a reflexive splicing language 


A rule set R is reflexive if, for each rule (u, u’, v’, v) in R, the rules (u, u’, u, u’) 
and (v’, v, v’, v) are also in R. When R is reflexive we say the same of any 
scheme or system having R as its rule set. In fact, splicing systems that model 
the cut and paste action of restriction enzymes and a ligase are necessarily 
reflexive. Consequently, from a modeling perspective, it is the reflexive splicing 
systems that are of prime interest.

Section 7.6 provides the tools to construct, for each regular language L and
each positive integer k, the following finite reflexive set T_k of splicing rules:


T_k = {(u, u′, v′, v) : the radius of (u, u′, v′, v) < k and each of the three rules
(u, u′, v′, v), (u, u′, u, u′), and (v′, v, v′, v) respects L}.


Recall that T_k(L) = ∪{r(L) : r ∈ T_k}, which is regular since T_k is finite and,
since L is regular, each r(L) is regular (as confirmed in Exercise 5 of this
section). Consequently L \ T_k(L) is also regular.


Theorem 7.1 (Pixton and Goode) A regular language L is a reflexive splicing
language if and only if L \ T_k(L) is finite, where k = 2(n(L)² + 1) and n(L) is
the cardinal number of the syntactic monoid of L.


Let L be a regular language. In Chapter 3 the syntactic monoid of L was 
defined in a way that allows n(L), and therefore also k, to be computed.
Section 7.6 provides a procedure for computing the finite set T_k, from which the
regular set T_k(L) can be computed and it can be decided whether L \ T_k(L) is
finite. Thus the theorem of Pixton and Goode provides an algorithm that allows
one to decide whether any given regular language is a reflexive splicing
language. It is tempting to suppose that when L \ T_k(L) is finite it can serve as the
set of initial words of a splicing system that generates L. Unfortunately this is 
not the case as shown in Exercise 3 of this section. Although the proof of the 
theorem is beyond the scope of this book, the decision procedure that it provides 
can, in principle, be carried out using the machinery this book has provided. 


Exercises 


(1) Let Σ = {a} and L = (aa)*. Compute n(L) and k for this language.

(2) Let Σ = {a, b} and L = a*ba*ba*. Compute n(L) and k for this language.

(3) Let Σ = {a, b} and L = Σ*. Note that uCv for every u, v in L and
consequently the syntactic monoid of L is a singleton.

(a) Compute n(L) and k.

(b) Describe T_k in words. How many elements does T_k contain?

(c) Compute T_k(L) and L \ T_k(L).

(d) Conclude that, when the set L \ T_k(L) is finite, it does not follow that
L \ T_k(L) is adequate to serve as the set I of initial words of any splicing
system (σ, I) for which L = L(σ, I). (This is what makes the proof of
Theorem 7.1 challenging.)

(e) Without using Theorem 7.1, specify a set R of splicing rules and a
set I of initial strings for which L = L(σ, I) for σ = (Σ, R). Hint:
(λ, λ, λ, λ) is a splicing rule.

(4) Show that the following definition of a reflexive splicing language is
equivalent to the definition given: A language L is a reflexive splicing language
if L = L(σ, I) for some splicing scheme σ = (Σ, R) and, for each rule
r = (u, u′, v′, v) in R, the rules (u, u′, u, u′) and (v′, v, v′, v) respect L.
(This alternative definition allows one to list fewer rules in the rule set
specifying a reflexive splicing system.)

(5) Let L ⊆ Σ* be a regular language. Let r = (u, u′, v′, v) be a splicing rule
with u, u′, v′, v in Σ*. Show that r(L) is regular. Hint: Use two copies, M
and M′, of the minimal automaton M that recognizes L. Let the sets of
states of these two automata be S and S′. Combine M and M′ into a single
automaton M″, having state set S ∪ S′, by adding carefully chosen new
edges that allow transitions from states in S to states in S’ having v as label. 
Choose the initial state i of M as the initial state of M” and choose the set 
F’ of final states of M’ as the set of final states of M”. 


Appendix A 
Cardinality 


Theorem A.1 If |S| ≤ |T| and |T| ≤ |S|, then there is a one-to-one
correspondence between S and T, i.e. |S| = |T|.


Proof Assume f : S → T and g : T → S are injective functions. For each
s ∈ S, we find g⁻¹(s) if it exists. We then find f⁻¹g⁻¹(s) if it exists. Then find
g⁻¹f⁻¹g⁻¹(s) if it exists. We continue this process. There are three possible
results: (1) The process continues indefinitely. (2) The process ends because
for some s_i in the process, there is no g⁻¹(s_i). (3) The process ends because
for some t_i in the process, there is no f⁻¹(t_i). Let S₁ be the elements of S for
which the first result occurs. Let S₂ be the elements of S for which the second
result occurs. Let S₃ be the elements of S for which the third result occurs.
Obviously these sets are disjoint. Similarly form T₁, T₂, and T₃ as subsets of
T. f is a one-to-one correspondence from S₁ to T₁. f is also a one-to-one
correspondence from S₂ to T₂. g⁻¹ is a one-to-one correspondence from S₃ to
T₃. Let θ : S → T be defined by


θ(s) = f(s) if s ∈ S₁
     = f(s) if s ∈ S₂
     = g⁻¹(s) if s ∈ S₃


θ is a one-to-one correspondence from S to T.
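
The construction of θ can be traced on a small concrete case. The following Python sketch assumes S = T = {0, 1, 2, ...} with the injections f(s) = s + 1 and g(t) = t + 1; these maps and the helper names are hypothetical and are chosen so that every backward chain stops, which means S₁ is empty here and the loop always terminates.

```python
def f(s):
    return s + 1            # injective map S -> T (assumed example)


def g(t):
    return t + 1            # injective map T -> S (assumed example)


def f_inv(t):
    return t - 1 if t >= 1 else None    # None when t has no preimage under f


def g_inv(s):
    return s - 1 if s >= 1 else None    # None when s has no preimage under g


def chain_stops_in(s):
    """Follow g_inv, f_inv, g_inv, ... from s; report where the chain stops.

    Returns 'S' if it stops at an element of S with no g-preimage (s is in S2)
    and 'T' if it stops at an element of T with no f-preimage (s is in S3).
    """
    side, x = "S", s
    while True:
        y = g_inv(x) if side == "S" else f_inv(x)
        if y is None:
            return side
        side = "T" if side == "S" else "S"
        x = y


def theta(s):
    # theta = f on S2 and g_inv on S3, as in the proof.
    return f(s) if chain_stops_in(s) == "S" else g_inv(s)


print([theta(s) for s in range(10)])   # [1, 0, 3, 2, 5, 4, 7, 6, 9, 8]
```

Although neither f nor g is onto, θ pairs each even number with its successor and each odd number with its predecessor, and so is a one-to-one correspondence.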





Theorem A.2 For any set A, |A| < |P(A)|. 


Proof Certainly |A| ≤ |P(A)| since for each element a in A, {a} is in P(A).
Assume |A| = |P(A)|. Then there is a one-to-one correspondence between
A and P(A). For a ∈ A let φ(a) be the element in P(A) paired with a.
Some elements in A belong to the element in P(A) with which they are
paired. For example, if a ∈ A and φ(a) = A in P(A), then a ∈ φ(a).
However, if a ∈ A and φ(a) = ∅, the empty set in P(A), certainly a ∉ φ(a). Let
W = {a : a ∈ A and a ∉ φ(a)}. W ∈ P(A), but no element in A can correspond
to W, for if φ(a) = W and a ∈ W, then by definition of W, a ∉ φ(a) and
a ∉ W. However, if φ(a) = W and a ∉ W, then a ∉ φ(a) and by definition of
W, a ∈ W. Hence we have a contradiction if any element of A corresponds to
W and there is no one-to-one correspondence between A and P(A).
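
For a finite set the diagonal argument can be checked exhaustively. The Python sketch below, with hypothetical helper names, builds W for every function φ from A = {0, 1, 2} to P(A) and confirms that W is never a value of φ.

```python
from itertools import combinations, product


def power_set(A):
    """All subsets of A as frozensets."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(A, phi):
    """W = {a in A : a not in phi(a)} for a map phi given as a dict."""
    return frozenset(a for a in A if a not in phi[a])


A = {0, 1, 2}
subsets = power_set(A)

# Try every function phi : A -> P(A); there are 8**3 = 512 of them.
missed_by_all = all(
    diagonal_set(A, dict(zip(sorted(A), images))) not in images
    for images in product(subsets, repeat=len(A))
)
print(missed_by_all)  # True
```

Since W is missed by every φ, no φ is onto P(A), which is the finite instance of the theorem.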














This theorem shows us that, for any infinite set, there is another infinite set 
with greater cardinality. We shall not prove it here but the cardinality of the real 
numbers is equal to the cardinality of the power set of the set of integers. 


Appendix B 


Co-compactness Lemma 


Lemma B.1 (Co-compactness Lemma) Let A be a finite set and let
{R_i : i ∈ I} be a family of retracts in A*. There is a finite subset F of I for
which ∩_{i∈F} R_i = ∩_{i∈I} R_i.

Proof We consider only the case for which there is a single key set K for
which, for each i ∈ I, K is a set of keys for the key code that generates R_i.
The general result then follows from the fact that there are only a finite number
of subsets, hence of possible key sets, of A. First we partition K into disjoint
subsets K′ and K″. Let K′ = {a ∈ K : for every finite subset J of I, a occurs
in at least one word in ∩_{i∈J} R_i}. Let K″ = K − K′.

From the definition of K″, it follows that, for each a ∈ K″, there is a finite
subset F(a) of I for which ∩_{i∈F(a)} R_i contains no word in which a occurs. Let
F″ = ∪_{a∈K″} F(a). The symbols in K that occur in words in ∩_{i∈F″} R_i are
precisely the symbols in K′.


Define an equivalence relation ~ in the set {R_i : i ∈ I} by R_i ~ R_j if, for each
a ∈ K′, the generator of R_i in which a occurs is identical with the generator of
R_j in which a occurs. In the next three paragraphs we show that there are only
finitely many ~ equivalence classes.

Choose an arbitrary index m ∈ I. Let C be the key code that generates R_m.
Let C′ be the subset of C consisting of those words with keys in K′. Let L
be the length of the longest word in C′. Note that, for any word w ∈ C*: (1)
the number of symbols to the left of the first occurrence of a key symbol in w
is less than or equal to L − 1; (2) the number of symbols occurring between
two successive occurrences of keys is less than or equal to 2L − 2; and (3) the
number of symbols to the right of the last occurrence of a key symbol in w is
less than or equal to L − 1.

Next we establish that, for every j ∈ I and every k ∈ K′, the generator
of R_j in which k occurs has length at most 4L − 2. For such j and k we
have: since G = F″ ∪ {m, j} is finite and k ∈ K′, there is a word w ∈ ∩_{i∈G} R_i
in which k occurs. Let w = x_0 a_1 x_1 a_2 ... x_{i−1} a_i x_i a_{i+1} x_{i+1} ... x_{n−1} a_n x_n, where
the a_i, 1 ≤ i ≤ n, are all the key occurrences in w. Hence, no key occurs in
any of the x_i, 0 ≤ i ≤ n. Note that all of the keys occurring in w must lie in
K′. The word w can be segmented into code words belonging to R_m and it can
also be segmented into code words belonging to R_j. We have k = a_i for some
i, 1 ≤ i ≤ n. Note that, if the length of the code word belonging to R_j in which
k occurs were greater than or equal to 4L − 2, this would contradict one of
(1), (2), or (3) of the final sentence of the previous paragraph.

We have shown that there is a bound B (= 4L − 2) such that, for every j ∈ I
and k ∈ K′, no code word of R_j in which k occurs can have length greater than
or equal to B. From the definition of the equivalence relation ~ we see that
there are only finitely many ~ equivalence classes.

Let F′ be a subset of I for which, for each i ∈ I, there is a unique j ∈ F′
for which R_i ~ R_j.

The statement of the lemma is true for F = F′ ∪ F″.